No.1150
My experience is that debugging with a debugger is often tricky in multiprocess, multilanguage projects – you'll want to throw gdb around one particular C program, and pdb around one particular python script, and orchestrating everything so that those two programs get launched with the debuggers you want but everything else is left alone can be tricky. I'm lazy, so my "solution" here has always been to give up and fall back on printf-debugging.
Once you've gone to printf-debugging, the trick is to find the right spots to put your printfs. If you're lucky, it'll be pretty obvious from the start that the bug has to be within some smallish, more-or-less self-contained function. In that case, there's not much scope for you to go wrong; just log the line number and the name and value of the variable of interest, and check the logs to see where things go bad.
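A rough sketch of that kind of logging, in Python rather than C. The helper name `dbg` and the `average` function are made up for illustration; the only point is that each log line carries the file, line number, variable name and value.

```python
import inspect
import sys


def dbg(name, value, out=sys.stderr):
    """Print 'file:line name=value' for the caller's location; return the text printed."""
    frame = inspect.currentframe().f_back  # the frame that called dbg()
    line = f"{frame.f_code.co_filename}:{frame.f_lineno} {name}={value!r}"
    print(line, file=out)
    return line


def average(values):
    # Hypothetical function under suspicion, sprinkled with log calls.
    total = sum(values)
    dbg("total", total)
    count = len(values)
    dbg("count", count)
    return total / count
```

Calling `average([1, 2, 3])` writes two lines to stderr, one for `total` and one for `count`, each tagged with where it was logged.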
The more thoroughly you understand the system you're debugging, the easier it'll be to narrow down "where the bug lives". In particular, debugging very often comes down to figuring out where some specific value "came from"; if you know which parts of the code actually modify that value, that makes tracking aberrations to their source much easier. When people talk up const-correctness in C or C++, or assign-once in Erlang, or immutable-by-default in Rust, this is part of the benefit they're hoping to gain: by mechanizing the knowledge of what values can be changed where, it becomes easier to be confident that there isn't some mutation you've missed.
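Python has nothing like C++ const or Rust's borrow checker, but a frozen dataclass gives a loose runtime analogue of the same idea. This is only a sketch; `Config` and `bump_retries` are hypothetical names, and the check happens when the code runs, not at compile time.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Config:
    retries: int
    timeout_s: float


def bump_retries(cfg):
    # In-place mutation is rejected, so the only way to "change" a Config
    # is to build a new one. Every change is then an explicit assignment
    # that you can grep for.
    return replace(cfg, retries=cfg.retries + 1)


def try_to_mutate(cfg):
    """Return True if in-place mutation was blocked."""
    try:
        cfg.retries = 99
    except FrozenInstanceError:
        return True
    return False
```

With this in place, "who modified `retries`?" reduces to "who called `replace` or built a new `Config`?", which is a much smaller search.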
Ordinarily, of course, we have a very imperfect understanding of the code we're debugging – if we had a really good understanding of it, there probably wouldn't be a bug in the first place. When that happens, the first order of business is to find some sort of solid ground to stand on. For example, perhaps you know that the variable "x" has a bogus value at line 112 of file "foo.c", but you don't really know anything else about it – not where it came from, not what's modified it, perhaps not even what "x" really means. The fact that you know that it's definitely somehow invalid at that specific location (under whatever conditions you need to follow to reproduce the problem) is still enough to get started: log the value of "x" at that particular line, then start scrolling up. If "x" was passed as a parameter to this function, grep around and figure out what code this function is called by. If, on the other hand, "x" is a local variable, then dig into how it is assigned. Is anything suspicious there? Add more printfs wherever you want more detail, and gradually – over multiple iterations, running your reproducer and examining the output it produces after each change – work your way backwards. Keep this up, and you'll eventually figure out where "x" comes from, and how its value goes bad. It's a slow, painful process, but it's reliable and it works even if you know very little about the structure of the code you're working with.
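One way to shortcut the "grep around for callers" step is to have the log line itself report who called the function. A minimal Python sketch, assuming hypothetical functions `process`, `load_config` and `handle_request`; `traceback.extract_stack` is standard library.

```python
import sys
import traceback


def where_from():
    """Describe the caller of the function that called where_from()."""
    # limit=3 keeps the three innermost frames, oldest first:
    # [caller of the instrumented function, instrumented function, where_from]
    frames = traceback.extract_stack(limit=3)
    caller = frames[0]
    return f"{caller.name} ({caller.filename}:{caller.lineno})"


def process(x):
    # Stand-in for the spot where "x" is known to be bogus.
    print(f"process: x={x!r}, called from {where_from()}", file=sys.stderr)
    return x * 2


def load_config():
    return process(-1)  # one possible source of a bad value


def handle_request():
    return process(21)
```

Each log line from `process` now names the calling function and line, so one run of the reproducer tells you which caller to instrument next.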
The iteration in the process above is important – it'll allow you to build up an understanding of the program as you get further into debugging it. That understanding will both guide your investigation going forward and inform your idea of what it means for a value of "x" to be "good" or "bogus". Without building up that understanding, it's easy to blindly skip over the point where "x" is set to some invalid value – you simply don't have the knowledge necessary to identify that value as bad.
Patience, to the point of simple bloody-mindedness, is your friend when investigating a difficult bug.
No.1152
I refuse to share my tips with someone who posts an image of that eyesore. Hope your job will get worse over time.
No.1155
>>1150
I actually wondered about this recently: how come a debugger today is still mostly a text editor with minimal annotations? For example, wouldn't it help to have some kind of data-flow view where you can easily track the places where a given object/variable is/could be modified across the program?
No.1159
>>1155
On a technical level, something like that ought to be possible. Actually, it ought to be possible to do it even outside the context of a running process – for example, syntax highlighting in a text editor could be made to show the dataflow for a particular variable. I've seen a little bit of mulling[1] and work[2] around this sort of thing, but not a lot. It does seem like it would be helpful; not sure why it's still so rare.
[1] https://visualstudiomagazine.com/articles/2014/08/01/semantic-code-highlighting.aspx
[2] https://www.jetbrains.com/help/idea/analyzing-data-flow.html
No.1172
>>1159
Compilers build data-flow graphs all the time, so it's certainly possible. The problem is that it is a lot of work and I don't know of any compiler that makes these graphs available for use. I wonder how JetBrains is doing it.
No.1173
>>1149
Any particular languages? My plan-of-attack is usually:
1 - create automated test that exposes bug (should succeed but doesn't)
2 - step through test
3 - find part that's wrong
4 - repeat with a more-focused test (e.g. if a single call elsewhere gave the wrong result, just test that call) until you've found the problem
5 - ensure all the tests written as part of this are added to CI, so this doesn't happen again
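A sketch of steps 1 and 4 using Python's `unittest`. The functions `order_total` and `parse_price` are invented for the example, and the code shown is the already-fixed version; imagine an earlier `parse_price` that forgot to strip the thousands separator.

```python
import unittest


def parse_price(text):
    """Hypothetical helper: '$1,250.50' -> 1250.5"""
    return float(text.replace("$", "").replace(",", ""))


def order_total(prices):
    return sum(parse_price(p) for p in prices)


class OrderTotalTest(unittest.TestCase):
    # Step 1: a broad test that reproduces the reported failure.
    def test_total_with_large_price(self):
        self.assertAlmostEqual(order_total(["$1,250.50", "$2.25"]), 1252.75)


class ParsePriceTest(unittest.TestCase):
    # Step 4: once stepping through points at parse_price, test just that call.
    def test_thousands_separator(self):
        self.assertAlmostEqual(parse_price("$1,250.50"), 1250.5)
```

Run with `python -m unittest`; both test classes stay in the suite afterwards, which covers step 5.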
No.1231
>>1173
It's a sweet plan, but in many cases the real work happens before step one. Even if the system under test is reliably failing, getting it to the failure can take serious time and anything before that point could have caused the problem. I've seen cases where almost an hour elapsed between the cause and the actual detection of the failure.
No.1232
>>1231
Another dimension here is when you just straight up don't have enough information to create a reproducer. Then you're reduced to jiggling this, poking that, trying this other thing, and thinking very hard about exactly what the system would have to do in order to behave in the way the person reporting the bug is describing. Succeeding under those sorts of circumstances feels pretty heroic.
No.1282
Do you think using techniques from mutation testing to randomly seed a program with bugs and then debugging those would be a valuable exercise or is it a waste of time?
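To be concrete about what I mean by seeding a bug, here is a toy sketch in Python using the standard `ast` module. It is deterministic rather than random, handles only one kind of mutation, and the `area` function and class name are made up; real mutation testing tools do far more.

```python
import ast

SOURCE = """
def area(w, h):
    return w * h
"""


class SwapMultToAdd(ast.NodeTransformer):
    """Mutation operator: turn every `a * b` into `a + b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # mutate nested expressions too
        if isinstance(node.op, ast.Mult):
            node.op = ast.Add()
        return node


def load(tree):
    """Compile a module AST and return its namespace."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace


original = load(ast.parse(SOURCE))
mutant_tree = ast.fix_missing_locations(SwapMultToAdd().visit(ast.parse(SOURCE)))
mutant = load(mutant_tree)

print(original["area"](3, 4))  # 12
print(mutant["area"](3, 4))    # 7
```

The exercise would be to hand someone the mutant without telling them what changed and have them track the wrong result back to its source.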
No.1283
I debug soykafty test cases written in C running on soykafty, broken hardware. I can at least tell you the basics:
- Make sure optimization is turned OFF (-O0)
- Make sure debugging support is turned on (-g3)
- Make sure you have the source code of the program (use the 'directory' command in gdb to add locations to search for the source code)
- use GDB's text user interface ('gdb -tui')
- When all else fails and your world is still not sane, a printf & a recompile can be the most reliable tool in your bag.
No.1305
>>1282
That seems like a waste of time. The kinds of bugs a mutation testing program will generate probably aren't the kinds of bugs that would provide useful debugging experience. Why not fix real bugs?
No.1310
>>1305
I don't know any project that could use my help.