Sunday, November 11, 2018

B-trees

i was thinking a bit more about my handscript recognition system for the NDS this week-end, and i figured out that the sorting techique might not need to be as complex as I initially thought.

The plan is to support the 128 characters of the ASCII charset. There should be a way to define multiple models for the same character — e.g. you might have the ‘d’ character done in either 1 or 2 strokes — but I doubt I would ever need much more than 4 model of each one. That means at most 1K models to sort.

In my technique they can all be indexed with a 16-bit code, but even then storing 64 K pointers would be overkill. I had plans for techniques that find the most discriminating bit(s) for a subset of the models an make an optimal radix tree, but then I realised that with only 1K entries, it might even not be required: a simple B-tree structure would index all the keys with just 64 index blocks of 16 pointers each, plus 4 super_index blocks and one top-level index. This is a perfectly sustainable amount of overhead. Plus I already have a B-tree implementation from my Clicker32 operating system project. . .

No comments: