Fixing Problems … Using Super-Global Variables

Another xkcd comic that hits the spot. Except, with my new hardware debugger, this is the past 😎. Recently, I debugged one of my electronic geocaching gadgets and was pleasantly surprised by how easy it was to track down my own mistake and to come up with the right fix.

I built a little box on which you can play Tetris. It looks like the one in the following picture and uses cool touch buttons based on SparkFun’s Capacitive Touch Keypad – MPR121.

While it usually worked well, sometimes it did not react to button touches. The cause was that the initial calibration of the MPR121 sometimes goes wrong, for instance, if you have your fingers on the touch buttons while the MPR121 calibrates.

To mitigate this problem, I put restart code into the firmware that forces a recalibration. In order to get a complete reset, I use the method suggested by Atmel, namely, to enable the watchdog timer with a very short timeout and then enter an infinite loop. This looks as follows.

#include <avr/wdt.h>

...
wdt_enable(WDTO_15MS);
while (1);

Now you have to make sure that you do not end up in a restart loop, because this is annoying to the user and will also deplete the battery. This can be achieved with a counter variable that survives resets. Such variables can be declared as follows.

unsigned char restarts __attribute__ ((section (".noinit")));

I call such variables super-global because their lifetime appears to be longer than the lifetime of ordinary global variables. Since they are not initialized after a program reset, you have to make sure that they receive a reasonable value when the MCU is powered up. Basically, there are two ways to do that. The first is to have a special 4-byte variable into which you write a “magic” value, which signals that the super-global variables have already been initialized. After power-up, the SRAM contents are arbitrary, so it is very unlikely that the magic value is already there.

#define MAGICVALUE 0xA6B3CCF1UL
unsigned char restarts __attribute__ ((section (".noinit"))); 
unsigned long magic __attribute__ ((section (".noinit"))); 

void setup() {
  if (magic != MAGICVALUE) {
    magic = MAGICVALUE; // signal that initialization has been done
    restarts = 0;       // initialize super-global variable
  }
  ...
}

A more direct way is to check the reset reason in the MCUSR register immediately after the program starts. This, however, works only if you do not use a bootloader, because a bootloader typically evaluates and clears MCUSR before your program runs. If you are not starting your program with a bootloader, you need a watchdog timer initialization function in any case: the watchdog stays enabled after a watchdog reset, so without such a function you can end up in WDT restart loops. So, here we go:

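// Runs before main(): evaluate the reset reason, then disable the watchdog
void wdt_init(void) __attribute__((naked)) __attribute__((section(".init3"))) __attribute__((used));
void wdt_init(void) {
  if (MCUSR & _BV(PORF)) // reset reason is power-up reset
    restarts = 0;
  MCUSR = 0;             // clear reset flags; WDRF must be cleared before the watchdog can be disabled
  wdt_disable();
}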
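If you want to look at the reset reason later in the program, for example to report it, one option is to save MCUSR before clearing it and to decode the saved copy. The following is only a sketch: the type and function names are made up for illustration, and the bit positions are those of the ATmega328P (an assumption; on a real target you would use PORF, EXTRF, BORF and WDRF from <avr/io.h>). It is written as plain C so that the decoding logic can also be tried out on a PC.

```c
#include <stdint.h>

/* Reset flag bit positions in MCUSR as on the ATmega328P (assumption).
   On a real AVR target, use _BV(PORF) etc. from <avr/io.h> instead. */
#define FLAG_PORF  (1u << 0)  /* power-on reset */
#define FLAG_EXTRF (1u << 1)  /* external reset */
#define FLAG_BORF  (1u << 2)  /* brown-out reset */
#define FLAG_WDRF  (1u << 3)  /* watchdog reset */

/* Hypothetical names, not part of avr-libc */
typedef enum {
    RESET_UNKNOWN,
    RESET_POWER_ON,
    RESET_BROWN_OUT,
    RESET_EXTERNAL,
    RESET_WATCHDOG
} reset_reason_t;

/* Decode a saved copy of MCUSR. Power-on is checked first because
   more than one flag may be set after power-up. */
reset_reason_t decode_reset_reason(uint8_t mcusr_copy) {
    if (mcusr_copy & FLAG_PORF)  return RESET_POWER_ON;
    if (mcusr_copy & FLAG_BORF)  return RESET_BROWN_OUT;
    if (mcusr_copy & FLAG_EXTRF) return RESET_EXTERNAL;
    if (mcusr_copy & FLAG_WDRF)  return RESET_WATCHDOG;
    return RESET_UNKNOWN;
}
```

On the target, the copy would be taken in the .init3 function right before MCUSR is cleared. The variable holding the copy has to live in .noinit as well, because .init3 code runs before ordinary global variables are initialized.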

Now one only has to check the number of restarts and issue an error message when it gets too high. I had tried out similar things before, and they always worked. This time, though, resetting the variable to 0 after an error message had been issued did not work. So, I fired up my hardware debugger dw-link, placed a number of breakpoints, checked the value of the restarts variable, and pretty easily found the misconception I had about my own code (which I had written a year ago). This was much less painful than placing print statements all over the place, recompiling, and so on.
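For illustration, the restart check mentioned above could be organized roughly as follows. This is a simplified sketch rather than the actual firmware code: MAX_RESTARTS, the NOINIT macro and request_restart() are names made up for this example, and the limit of 3 is arbitrary. On a PC, the section attribute is left out, so the logic can be tested there.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_RESTARTS 3  /* arbitrary limit, chosen for illustration */

/* Place the counter in .noinit on AVR; on a PC it is an ordinary
   (zero-initialized) variable so the logic can be tested. */
#ifdef __AVR__
#define NOINIT __attribute__((section(".noinit")))
#else
#define NOINIT
#endif

static uint8_t restarts NOINIT;

/* Hypothetical helper: call it when a recalibration is needed.
   Returns true if one more watchdog restart is allowed, and counts it.
   Returns false if the limit has been reached; the counter is then
   cleared so that the next request starts a fresh series. */
bool request_restart(void) {
    if (restarts >= MAX_RESTARTS) {
        restarts = 0;
        return false;
    }
    restarts++;
    return true;
}
```

On the target, a true result would be followed by wdt_enable(WDTO_15MS); while (1); and a false result by the error message.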

What was most amazing, though, was the fact that the debugger worked much better than I had expected. From Atmel’s list of known issues of the AVR JTAGICE mkII Debugger for debugWIRE, I had copied the warning that “BOD and WDT resets lead to loss of connection” into the dw-link manual. Well, it turned out that a target reset does not affect the connection to the debugger at all. When the MCU is reset, it is still in debugWIRE mode and will execute normally after the reset. When one then stops the MCU asynchronously with Ctrl-C, or the MCU hits a breakpoint, it sends a break condition and the letter ‘U’, which the debugger uses to re-synchronize with the MCU. The only slight problem is that the MCU forgets the hardware breakpoint after a reset. Since this hardware breakpoint is used as a ‘joker’ by the debugger, one can never be sure which of the set breakpoints uses the hardware breakpoint. So, if you want to be sure to stop in the initialization routine after a reset, you have to set two distinct breakpoints in the setup routine.

Most probably, a couple of other warnings in the list of known issues are also baseless. Changing the OSCCAL value, which changes the MCU clock frequency, for instance, should be unproblematic as well, because the debugger always re-synchronizes after a stop. Most probably, the AVR JTAGICE mkII Debugger was simply less sophisticated.

All in all, this was probably the first time that doing something new with dw-link and avr-gdb did not uncover another problematic spot (such as jumps into ISRs when single-stepping, important debugging information being optimized away, or other funny surprises). Instead, I found out that the debugger covers more ground than expected.
