We now set BP once at the start to point to the beginning of the BPB. A direct address access can then be encoded relative to BP, which saves one byte per access (an 8-bit displacement instead of a 16-bit address).
This is actually working now.
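
To illustrate the size difference, here is a minimal sketch that hand-encodes the same 16-bit load both ways. The load address 0x7C00, the BPB base 0x7C0B, the choice of BX, and the field at 0x7C0E are assumptions for illustration only, not taken from the actual code. BX is used because AX has its own short direct-address form, which would hide the saving.

```python
# Sketch: compare the encoding of "mov bx, [addr]" with "mov bx, [bp+disp8]".
# All addresses below are illustrative assumptions, not the real layout.

def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModR/M fields into one byte."""
    return (mod << 6) | (reg << 3) | rm

def mov_r16_direct(reg: int, addr: int) -> bytes:
    """MOV r16, [disp16]: opcode 8B, mod=00, r/m=110, then a 16-bit address."""
    return bytes([0x8B, modrm(0b00, reg, 0b110), addr & 0xFF, (addr >> 8) & 0xFF])

def mov_r16_bp_disp8(reg: int, disp: int) -> bytes:
    """MOV r16, [bp+disp8]: opcode 8B, mod=01, r/m=110, then a signed 8-bit displacement."""
    if not -128 <= disp <= 127:
        raise ValueError("displacement does not fit in 8 bits")
    return bytes([0x8B, modrm(0b01, reg, 0b110), disp & 0xFF])

BX = 3                 # register number of BX in the ModR/M reg field
BPB_BASE = 0x7C0B      # assumed: value loaded into BP once at startup
FIELD_ADDR = 0x7C0E    # assumed: address of some word-sized BPB field

direct = mov_r16_direct(BX, FIELD_ADDR)
relative = mov_r16_bp_disp8(BX, FIELD_ADDR - BPB_BASE)

print("direct:     ", direct.hex(), len(direct), "bytes")
print("BP-relative:", relative.hex(), len(relative), "bytes")
```

The saving only applies while the target lies within -128..127 bytes of BP. Note also that BP-based addressing defaults to the SS segment, so this relies on SS and DS referring to the same segment.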