Remember Apple Assault's frenzy, and how you could end up not seeing the appleman running towards you because it's actually sneaking behind a row of stunned applemen? That has never been fixed to date, because I had no way to tell the game engine that "those stunned applemen should be moved to the background".
Then I remembered that "talk" by Rafael Baptista about resource management on the GBA. Before he got to the meat of his talk, he suggested that copying OAM entries (you can think of them as sprite descriptors) could be the right time to follow a Z-order linked list and keep the "hardware" OAM entries and their "shadow" counterpart. Nice move, although I still wonder how he manages the scaling/rotation matrices along the way...
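Here is a minimal sketch of how I understand that idea, not Baptista's actual code: the shadow sprites are chained front-to-back, and the copy walks the chain while filling hardware slots in increasing order (on GBA/DS, a lower OAM index is drawn in front at equal priority). `ShadowSprite`, `HwOamEntry` and `flushZList` are names made up for the illustration. The fourth word of each hardware entry, where the rotation parameters live, is simply left alone.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical shadow descriptor: three attribute words plus a Z-order link.
struct ShadowSprite {
    std::uint16_t attr[3];
    ShadowSprite* next;   // next sprite in the chain, i.e. further back
};

// Hardware-shaped entry: three attribute words, then one interleaved
// rotation/scaling word that this copy never touches.
// (On real hardware the destination would be a volatile pointer into OAM.)
struct HwOamEntry {
    std::uint16_t attr[3];
    std::uint16_t affine;
};

// Walk the Z-order list from front to back and fill hardware slots 0, 1, 2...
// Returns the number of slots written.
std::size_t flushZList(const ShadowSprite* front, HwOamEntry* hw,
                       std::size_t capacity) {
    std::size_t slot = 0;
    for (const ShadowSprite* s = front; s != nullptr && slot < capacity;
         s = s->next) {
        for (int w = 0; w < 3; ++w) {
            hw[slot].attr[w] = s->attr[w];
        }
        ++slot;
    }
    return slot;
}
```

Changing the depth of a sprite then only means relinking it in the list; its shadow entry stays where it is.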
I used to have a large, one-chunk DMA transfer of all the sprites & rotation information. I would have to split that into slices of 3 u16 each. That sounds like I'll have to give sverx's benchmark report on memory copy performance a second, more careful read... and possibly opt for an intermediate version (sorted, but in cache) built outside of the vblank period, and then DMA that into VRAM when the vblank hits.
A final consideration: there is little chance that I'll need fine-grained Z-sorting. Low/normal/high priority should be enough in 99% of the cases, and the remaining case could likely be "sorted" out at GOB instantiation by internally sorting the OAMs the GOB received.
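A rough sketch of what three coarse levels could look like, assuming one singly-linked list per level and a flush that visits the front level first. The `ZLayer`, `ZNode` and `ZLists` names are invented for the example and are not the engine's real types.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical coarse depth levels; level 0 is drawn in front.
enum ZLayer { Z_FRONT = 0, Z_NORMAL = 1, Z_BACK = 2, Z_LAYERS = 3 };

// One node per sprite, pointing at its shadow OAM entry by index.
struct ZNode {
    std::uint8_t shadowIndex;
    ZNode* next;
};

struct ZLists {
    ZNode* head[Z_LAYERS] = {nullptr, nullptr, nullptr};

    // O(1) insertion at the head of a level: no sorting within a level.
    void push(ZLayer layer, ZNode* n) {
        n->next = head[layer];
        head[layer] = n;
    }

    // Lists the shadow indices in the order hardware slots should be filled
    // (front level first). Returns how many indices were written.
    std::size_t order(std::uint8_t* out, std::size_t capacity) const {
        std::size_t n = 0;
        for (int l = 0; l < Z_LAYERS; ++l) {
            for (const ZNode* p = head[l]; p != nullptr && n < capacity;
                 p = p->next) {
                out[n++] = p->shadowIndex;
            }
        }
        return n;
    }
};
```

With that, sending the stunned applemen to the background would be a matter of pushing them on `Z_BACK` while the running one stays on `Z_NORMAL`.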
edit: I asked myself the right question: 'is that move away from DMA sync'ing gonna cost me framerate or not?'. From that point on, I hacked together a modified version of my game engine that uses a CPU-driven copy, merging OAM and rotations into one single table (as the DS hardware expects), and started Apple Assault with the aggressivity setting turned to max (so many apples that you'll experience "ghost" berry bats and shots). Good news: it still works flawlessly (that is, I can't observe any slowdown). Next: the depth re-ordering...
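For the record, a simplified sketch of such a merging copy (not the engine's actual code). It assumes the usual GBA/DS layout: 128 entries of four u16 each, where the fourth word of four consecutive entries holds the pa, pb, pc, pd parameters of one of the 32 matrices. `SpriteAttr`, `Affine` and `mergeOam` are hypothetical names, and unused slots are hidden with the 0x0200 "disabled" value in attribute 0.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kSprites = 128;          // OAM entries per engine
constexpr std::size_t kMatrices = 32;          // rotation/scaling matrices
constexpr std::uint16_t kAttr0Hidden = 0x0200; // attr0 "sprite disabled"

struct SpriteAttr { std::uint16_t attr0, attr1, attr2; };
struct Affine { std::int16_t pa, pb, pc, pd; };

// Build the interleaved table (kSprites * 4 u16) from separate sprite and
// matrix arrays. 'table' may be a cached buffer that is later sent to OAM.
void mergeOam(const SpriteAttr* sprites, std::size_t nSprites,
              const Affine* mats, std::size_t nMats,
              std::uint16_t* table) {
    for (std::size_t i = 0; i < kSprites; ++i) {
        std::uint16_t* e = table + 4 * i;
        if (i < nSprites) {
            e[0] = sprites[i].attr0;
            e[1] = sprites[i].attr1;
            e[2] = sprites[i].attr2;
        } else {
            e[0] = kAttr0Hidden;  // hide slots that hold no sprite
            e[1] = 0;
            e[2] = 0;
        }
    }
    // Matrix m occupies the 4th word of entries 4m .. 4m+3.
    for (std::size_t m = 0; m < nMats && m < kMatrices; ++m) {
        std::uint16_t* base = table + 16 * m;
        base[3]  = static_cast<std::uint16_t>(mats[m].pa);
        base[7]  = static_cast<std::uint16_t>(mats[m].pb);
        base[11] = static_cast<std::uint16_t>(mats[m].pc);
        base[15] = static_cast<std::uint16_t>(mats[m].pd);
    }
}
```

Since the sprite loop picks its source entry by index, it is also the natural place to substitute a Z-ordered lookup later on.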
edit+++: checked that a (Simple)Gob set on zlist[0] appears in front of (Compound)Gobs set on zlist[1]... now I've got to figure out how best to use that to implement the pullmask for my "roll jump" animation...