The featured image of this post is a comic from xkcd.com.

The above xkcd comic, titled Debugger, alludes to the problem that when you apply a method to itself, you might not get what you asked for. Turing’s halting problem is a very famous example of this: no algorithm can decide, for every algorithm and every input, whether that algorithm terminates on that input. So, does that problem apply to debuggers as well? In particular, I asked myself whether it makes sense to debug the hardware debugger I am developing with itself.

General Objections

The main argument against using a debugger to debug itself is this: if the debugger contains a bug, how can you be sure that this particular bug will not obscure the bug you are hunting for? And indeed, you cannot be sure.

On the other hand, we routinely debug other programs knowing that the debugger we are using is not bug-free. For instance, when you have a look at the list of bugs in the GDB debugger, you will notice that this list has roughly 3000 entries. Fortunately, almost all of them are probably not relevant for your particular case, so you will use GDB nevertheless. And, of course, you can also debug GDB with itself. When doing so, one should be aware of the limitations, though.

Limitations of applying a hardware debugger to itself

In my case, developing the hardware debugger dw-link, only a small part of the overall debugging system has to be debugged. Nevertheless, when applying dw-link to itself, one should be aware of the limitations of doing so.

First of all, one has to be sure that the basic functionality of the hardware debugger can be trusted. This means that in the early stages of developing the debugger, one had better use other techniques.

Secondly, time-critical parts of the program are impossible to debug using the debugger. Take the lowest level of the debugWIRE protocol, the single-wire asynchronous serial communication over the RESET line. Debugging on this level can only be done by methods that do not (or only minimally) change the timing behavior. So, in this case a logic analyzer is of great help, as I have shown in the description of the development of the SingleWireSerial library. A similar comment applies to the next higher level, the exchange of commands and data with the target in order to read and write the memory and registers of the target. And the logic analyzer is the tool of choice here as well.
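To get a feeling for why a breakpoint is hopeless at this level, here is a small back-of-the-envelope calculation. The line speed of 125 kbit/s is only an assumed, illustrative value (it is not taken from dw-link), and the function names are made up for this sketch.

```python
# Rough arithmetic: how many bit slots pass while a debugger halts the CPU?
# ASSUMED_BPS is an illustrative assumption, not a value taken from dw-link.
ASSUMED_BPS = 125_000


def bit_time_us(bits_per_second):
    """Duration of one bit on the serial line, in microseconds."""
    return 1_000_000 / bits_per_second


def bits_elapsed(halt_ms, bits_per_second):
    """Number of bit slots that go by during a halt of halt_ms milliseconds."""
    return halt_ms * 1000 / bit_time_us(bits_per_second)


if __name__ == "__main__":
    print(bit_time_us(ASSUMED_BPS))      # length of one bit in microseconds
    print(bits_elapsed(1, ASSUMED_BPS))  # bit slots lost in a 1 ms halt
```

Under this assumption, a single bit lasts only a few microseconds, so even a halt of one millisecond spans far more than a whole frame. A logic analyzer, in contrast, only observes the line and leaves the timing untouched.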

Once these parts are working reliably enough, one can, of course, employ another dw-link in order to debug dw-link. One can even stop in time-critical parts and inspect the machine state. One needs to be aware that a restart may be called for in this case, though. There is only one remaining issue one has to be aware of, namely, how to start up the two instances of dw-link.

Starting a (meta-level) debug session

When a serial connection to the hardware debugger dw-link is established (when the GDB command target remote ... is executed), then dw-link or the target or both are reset. Now, when we connect two GDB instances to two instances of the hardware debugger, we have to choose the right order in which to start up the entire system.

Before we dive into this issue, let us visualize the general setup.

Since we want to debug the object-level dw-link, we need to make sure that the RESET line does not have a capacitive load. In case it is an Arduino board, one needs to cut the RESET EN solder bridge. This implies that when connecting to the board, there will be no RESET impulse.

So, what happens when we start the object-level debugger and dw-link first? When the meta-level dw-link tries to establish a debugWIRE connection with the object-level dw-link, it will reset the object-level dw-link. For this reason, the object-level dw-link will lose the debugWIRE connection to the target and the GDB RSP connection to the object-level debugger. So, this does not work at all.

Let’s try to start the meta-level debugger and dw-link first. Then the debugWIRE connection to the object-level dw-link will be established (resetting the object-level dw-link). If we now start the object-level dw-link with a meta-level debugger command, the object-level debugger can establish a connection to it, and the object-level dw-link can establish a debugWIRE link to the target. This means we are in business now.
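The two start-up orders can be played through in a toy model. This is only a sketch of the reasoning above, not of the actual firmware: the class, the function names, and the connection labels are all hypothetical, and the only rule modelled is that establishing the meta-level debugWIRE connection resets the object-level dw-link.

```python
from dataclasses import dataclass, field


@dataclass
class DwLink:
    """Toy stand-in for one dw-link instance (hypothetical model)."""
    name: str
    running: bool = False
    links: set = field(default_factory=set)  # currently established connections

    def reset(self):
        # A reset drops every connection this instance was holding.
        self.links.clear()
        self.running = False


def start_object_level(obj):
    # Object-level GDB connects, and the object-level dw-link attaches to the target.
    obj.running = True
    obj.links |= {"gdb-rsp", "debugwire-to-target"}


def start_meta_level(meta, obj):
    # Meta-level GDB connects; setting up debugWIRE resets the object-level dw-link.
    meta.running = True
    meta.links |= {"gdb-rsp", "debugwire-to-object-dw-link"}
    obj.reset()


def session_usable(order):
    """Return True if both levels are connected after starting in the given order."""
    meta, obj = DwLink("meta"), DwLink("object")
    for step in order:
        if step == "object":
            start_object_level(obj)
        else:
            start_meta_level(meta, obj)
    needed = {"gdb-rsp", "debugwire-to-target"}
    return meta.running and obj.running and needed <= obj.links


if __name__ == "__main__":
    print(session_usable(["object", "meta"]))  # object level first
    print(session_usable(["meta", "object"]))  # meta level first
```

In this model, starting the object level first leaves the object-level dw-link without its connections, whereas starting the meta level first leaves both levels connected.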

I tried it out, and it works indeed!

However, it is a bit confusing. And one has to be careful not to stop at a breakpoint in the middle of an ongoing communication. I have yet to reach the point where I start up the entire machinery to hunt a bug in the firmware. However, it is definitely less annoying than spreading print statements throughout the code.