Bilou HomeBrew's Blog: impure

Tuesday, April 21, 2009

impure_data

Some bugs are definitely easier to track than others. And bugs that appear in standard libraries are definitely the trickiest you're likely to encounter, because you know little of what's inside the library, after all ^^"

This specific bug happened in _fread_r, the re-entrant version of fread, meaning that neither PC (current program location, within _fread_r) nor LR (register holding the last return address, within fread) was of any help. If 10 years of C programming taught me one thing, it is that bugs are not in the standard libraries, but in how you invoke them. fread, strcpy and friends trust you to give them pointers where pointers are due and will not try to do anything fancy to ensure you're actually entitled to read/write those memory locations. Once you've got that wired in /dev/brain, you can start debugging.

Dear French-speaking readers, this is a technical post about debugging programs on the DS console: how to interpret the contents of the stack, how to disassemble code, and so on. If you think you already have the technical background (basics of assembly, registers, stack, addresses) but English is a barrier, leave a comment and I will translate...

As usual, my starting point is the guru meditation screen of the DS, but this time, registers are of little help: what really matters is the stack dump just below.

2029894 : crash within _fread_r

stack words, top of stack first              what they are
00000011 00000001 00000e0c 00000000
00000000 00000000 00000000 0b003bd6
00000001 00000011 00000000                   _fread_r internal variables, then _fread_r saved registers
020299c7                                     20299c6 : call by fread
00000000 ffffffff                            fread internal variables
0b003c20 00000000 00000000                   fread saved registers
020362cf                                     20362ce : call by FileRead::read
0b003bd6                                     FileRead::read saved registers
0200c721                                     200c720 : call by XMtransport::load

Unfortunately such "stack unwinding" is more complicated on ARM CPUs than on the x86 architecture, because parameters to function calls are typically kept in registers (r0..r3 for the first four arguments) rather than pushed on the stack, and because the compiled code here maintains nothing like the x86 "base frame pointer". How many words each function takes on the stack can only be deduced by disassembling the corresponding function with "arm-eabi-objdump -drl <file.arm9.elf>", e.g.


fread():
# address code   disassembled
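20299ac: b570   push    {r4, r5, r6, lr}     ; save 4 registers (16 bytes)
20299ae: 1c16   adds    r6, r2, #0           ; r6 = r2 (3rd argument of fread)
20299b0: 4a07   ldr     r2, [pc, #28]   (20299d0 <.text+0x296d0>)   ; r2 = word stored at 20299d0
20299b2: 1c0d   adds    r5, r1, #0           ; r5 = r1 (2nd argument)
20299b4: b082   sub     sp, #8               ; reserve 8 bytes on the stack
20299b6: 1c04   adds    r4, r0, #0           ; r4 = r0 (1st argument)
20299b8: 1c21   adds    r1, r4, #0           ; 1st argument of fread becomes 2nd of _fread_r
20299ba: 6810   ldr     r0, [r2, #0]         ; 1st argument of _fread_r, loaded from memory
20299bc: 9300   str     r3, [sp, #0]         ; 4th argument of fread is passed on the stack
20299be: 1c2a   adds    r2, r5, #0
20299c0: 1c33   adds    r3, r6, #0
20299c2: f7ff ff4f       bl      2029864 <_fread_r>
20299c6: b002   add     sp, #8               ; release the 8 bytes
20299c8: bc70   pop     {r4, r5, r6}
20299ca: bc02   pop     {r1}     ; retrieve LR
20299cc: 4708   bx      r1       ; return
20299ce: 0000   lsls    r0, r0, #0           ; padding, never executed
20299d0: ebe8 0204       undefined           ; data, not code: the word 0204ebe8 loaded at 20299b0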

teaches us that fread saves 4 registers on the stack before it starts executing and that it needs 8 bytes of local storage for its own use. The order of the registers in a push instruction is a bit confusing, but it works like this: lr is pushed first, then r6, then r5, and r4 ends up at the top of the stack when the processor is ready to execute the next instruction at 20299ae. Don't be confused by the fact that sub sp, #8 means "reserve 8 bytes for local variables": the stack grows downwards here, as on most CPU architectures.
Now, I have to admit that this is quite a tedious way to go, so you're more likely to just check the output of arm-eabi-objdump -h arm9/runme.arm9.elf :

Idx Name          Size      VMA       LMA       File off  Algn
0 .init         000002dc  02000000  02000000  00008000  2**4
           CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .text         00040520  02000300  02000300  00008300  2**6
           CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .fini         0000000c  02040820  02040820  00048820  2**2
           CONTENTS, ALLOC, LOAD, READONLY, CODE

which tells us, plain and simple, that everything between addresses 02000300 and 02040820 is our code. If you spot any value within that range in your stack dump, it's very likely a value of lr that has been pushed, and you can just call arm-eabi-addr2line to figure out where it is in your program. Note, however, that addr2line doesn't manage to extract function names out of the standard library components, which are just referred to as "crtstuff:0" ... however, as soon as your own code is reached, you usually know by reading line 95 of FileRead::read that the library function you're calling is actually fread...

But what actually puzzled me in this debug session was a suspicious data address that objdump resolved into "impure_data". It did not appear in the program map (build/.map) and was co-located with _impure_ptr in lib_a-impure.o ... As usual, Google turns up tons of bug reports mentioning impure_data, but few clues. So I downloaded the sources of newlib (the libc used in the devkitpro project) and almost immediately located newlib/libc/reent/impure.c and

struct _reent __ATTRIBUTE_IMPURE_DATA__ _impure_data = _REENT_INIT(_impure_data);
struct _reent *__ATTRIBUTE_IMPURE_PTR__ _impure_ptr = &_impure_data;

Of course, I should have guessed that right from the start: "impure" is just a nickname for the "reentrant data structure" of the C library in its newlib incarnation, following the precept that "A pure function is one with no side effects; an impure function is any other". It is the data that must be kept apart from the functions so that e.g. you can call strtok within an interrupt handler even if strtok has itself been interrupted.

Btw, Peter Schraut wrote a nice blog entry about DS "guru meditation" screens and how to handle them.

2 comments:

PypeBros said...: Note that:
- the value of LR is a _return_ address, i.e. the instruction following the call and not the call itself
- here, LR generally has an _odd_ value although the instructions are all 2 or 4 bytes long. In reality, bit #0 indicates that the code is in "Thumb" mode, which trades a bit of the ARM processor's expressiveness for better code density.

http://www.simplemachines.it/doc/ARM_COMBO_ap01.html; 2:38 pm
PypeBros said...: I was puzzled by all those ARM instructions that end with an S: adds, movs, lsls ... It is not an indicator of the operation size but a flag specifying that the contents of the "status register" must be updated.

In fact, in "Thumb" mode, there is generally no way to prevent an instruction from modifying the flags, or to force it to do so in the cases where it is designed not to touch them.; 3:52 pm
