Monday, May 27, 2024

Compile-Time Regular Expressions

Well, guess what: there's a C++ header-only library out there that lets you compile regular expressions at build time. And I've got state machine parsing code that could use a little performance boost, especially if I envision GBA support rather than NDS for some future release.

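#include <ctre.hpp>
#include <optional>
#include <string_view>
#include <utility>

// Needs C++20: the pattern is passed as a string-literal template argument.
// Returns the two captured state numbers (still as text), or nullopt on mismatch.
std::optional<std::pair<std::string_view, std::string_view>>
match(std::string_view sv) noexcept {
    if (auto re = ctre::match<"state([0-9]++) *-> *state([0-9]++) *on (hit|found|event|fail)">(sv)) {
        return std::pair{
            std::string_view(re.get<1>()),
            std::string_view(re.get<2>())
        };
    }
    return std::nullopt;
}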

That is what the siscanf(ln, "state%d -> state%d on %[hitfoundeventfail]%n") test for state transitions would look like, for instance. Some things will be cheaper: the regular expression tests explicitly for one of the 4 transition types, rather than accepting any garbage word you could craft with their letters and then comparing it against "event" with strcmp() afterwards.
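To make the "garbage word" point concrete, here is a small sketch. It assumes the standard sscanf as a stand-in for newlib's siscanf, drops the %n, and uses two hypothetical helper names (scanf_accepts, is_known_reason) that are not from the parser itself. The scanset only checks that each character belongs to the set, so a word like "tide" passes the scan and has to be rejected by a second test.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: true when the scanf-style pattern accepts the line.
// The %15[...] scanset takes ANY run of the listed letters, not just the 4 words.
bool scanf_accepts(const char* line) {
    int from = 0, to = 0;
    char reason[16] = {};
    return std::sscanf(line, "state%d -> state%d on %15[hitfoundeventfail]",
                       &from, &to, reason) == 3;
}

// Hypothetical helper: the extra test needed after the scan succeeded.
bool is_known_reason(const char* reason) {
    static const char* const known[] = {"hit", "found", "event", "fail"};
    for (const char* k : known) {
        if (std::strcmp(reason, k) == 0) return true;  // strcmp returns 0 on equality
    }
    return false;
}
```

With the regular expression, the (hit|found|event|fail) alternation does both jobs in a single pass.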

Other things will require more code, like converting re.get<1>() into a number as a post-processing step, since regular expressions only do text matching and extraction, not text-to-number conversion.
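One possible shape for that post-processing step, as a sketch only: it assumes a toolchain with std::from_chars (C++17), and to_number is a hypothetical helper name, not something CTRE provides. It takes the captured text as a std::string_view and either yields an int or reports failure.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: turn the digits captured by the regex into an int.
// Returns nullopt on empty input, trailing garbage, or overflow.
std::optional<int> to_number(std::string_view digits) noexcept {
    int value = 0;
    const char* first = digits.data();
    const char* last = first + digits.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

Since the pattern already guarantees that the capture holds only digits, the remaining failure case in practice is overflow on an absurdly long state number.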

The generated code was quite convincing on x86_64. I'm a bit more suspicious about its ability to improve performance on a 32-bit ARM processor, given that it takes 9 instructions to match every single character of the "state" constant string, for instance. Even if it helps for speed, it might have a significant impact on code size...

It seems to build even with GCC 10.2.0, the latest I installed from devkitpro, but not with the one used to build SchoolRush ...
