Bitplanes in a Sokoban Level Solver

I've been building a Sokoban editor and solver for tilebasedworlds.com — a side project for tile-based puzzle tools. The solver ships entirely in the browser as a WASM binary, which meant the first real question was not "how do I implement A*?" but "how do I represent a level?" That choice turns out to propagate through every part of the solver, the URL format, and the state-space design. The answer I landed on was bitplanes.

What XSB Gets Wrong for Computation

XSB is the de facto standard for Sokoban levels. It encodes the grid as printable ASCII — # for walls, $ for boxes, . for goals, @ for the player, space for empty floor, * for a box on a goal, + for the player on a goal. A level looks like this:

##########
#        #
#        #
#  @     #
#   $ .  #
#        #
#        #
##########

For exchange and display, XSB is perfectly adequate. It is human-readable, widely supported, and trivially round-tripped. But it is a poor basis for a solver.

The immediate problem is overhead: you are using one byte per cell when each cell needs only 3 bits — there are eight possible states and log2(8) = 3. Every byte is wasting 5 bits of information capacity. A more serious issue is that XSB encodes all entity types in a single pass. To ask “is there a wall at (x, y)?” you do a string index, parse a character, and check it against '#'. To ask “is there a box at (x, y)?” you must additionally account for '*' (box on goal). The multi-character overlap is a source of branch complexity that compounds when you are checking thousands of positions per second.

The deepest problem is hashing. A Sokoban solver must track visited states to avoid redundant search. The state is the box configuration — which cells currently contain boxes — plus a normalised player position. If your box configuration is a String or a Vec<char>, hashing it means walking every character in the grid, most of which are irrelevant (walls, floor). The hash is proportional to the total grid area, not to the number of boxes.

Arrays Are Better, But Not By Enough

The natural step up from XSB is a flat array — one u8 or enum per cell — with the grid dimensions stored separately. This is cleaner: no parsing, direct indexing by y * width + x, straightforward to iterate. Wall checks become an array lookup instead of a character comparison. It is fast enough for simple tasks.

What it does not fix is the hashing problem, and it introduces a new issue: the wall layout is stored redundantly in every search state. Walls never change. If you clone the level grid as part of each node in your A* search tree, you are copying megabytes of data that never differ between states. You can factor walls out into a shared reference, but then you have split the representation — some state is in the shared level, some is in the per-node clone, and the types reflect that awkward division.

What you actually want to hash is just the box positions. The wall layout and goal positions are static; the player position can be normalised to a single integer. The hash should be proportional to the number of changed things, not the size of the grid.

Bitplanes

A bitplane is a packed 1-bit-per-cell grid for a single entity type. Instead of one array encoding everything, you have separate planes — one for walls, one for boxes, one for goals. Each plane stores exactly one bit per cell. A 10×8 grid needs 80 bits, which is 10 bytes — 1.25% of what an equivalent byte array would use.

The structure is straightforward:

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct BitPlane {
    pub width: u8,
    pub height: u8,
    data: Vec<u8>,
}

Bits are stored MSB-first: bit 0 of byte 0 is cell (0, 0). A flat index i = y * width + x maps to byte i / 8, bit 7 - (i % 8):

#[inline]
pub fn set(&mut self, x: u8, y: u8) {
    let i = Self::idx(self.width, x, y);
    self.data[i / 8] |= 1 << (7 - i % 8);
}

#[inline]
pub fn get(&self, x: u8, y: u8) -> bool {
    let i = Self::idx(self.width, x, y);
    self.data[i / 8] & (1 << (7 - i % 8)) != 0
}

get and set are marked #[inline] because they sit on the hot path — the solver calls them in tight loops. The implementation is two integer operations and a bit mask.

The critical property is that BitPlane derives Hash. This is possible because Vec<u8> is Hash, and the packed byte representation is the canonical form. No special hash implementation is needed. The hash of a bitplane is the hash of its bytes — compact, correct, and available for free.

How the Solver Uses Them

The SokobanLevel splits into three planes immediately on parse:

pub struct SokobanLevel {
    pub(crate) width: u8,
    pub(crate) height: u8,
    pub(crate) walls: BitPlane,
    pub(crate) boxes: BitPlane,
    pub(crate) goals: BitPlane,
    pub(crate) player_pos: u16, // flat index
}

Walls and goals are static. The solver holds a reference to the static level data and clones only the boxes plane into each search state:

#[derive(Clone, Eq, PartialEq, Hash)]
struct State {
    boxes: BitPlane,
    player_norm: u16,
}

This is the entire search state — a packed bit vector and a two-byte integer. Cloning a state for a 10×8 grid costs 10 bytes of heap allocation for boxes, plus the integer. Hashing it is proportional to those 10 bytes, not to grid area. At 50,000 nodes explored, this difference is felt.

Flood Fill and Player Normalisation

A key insight in Sokoban solving is that the specific player position does not define a unique state — what matters is which cells the player can reach. Two states with the same box configuration but player positions in the same connected region are equivalent. The solver normalises the player to the topmost-leftmost reachable cell, so equivalent states hash identically and are pruned.

The flood fill that computes reachability produces a BitPlane:

fn flood_fill(
    start_x: u8, start_y: u8,
    walls: &BitPlane, boxes: &BitPlane,
    width: u8, height: u8,
) -> BitPlane {
    let mut visited = BitPlane::new(width, height);
    let mut queue = VecDeque::new();
    // ...BFS expanding into cells that are neither walls nor boxes
    visited
}

The result is immediately usable as a reachability mask. To check whether the player can stand at (pfx, pfy) to make a push, the solver calls reachable.get(pfx, pfy) — one operation, no separate data structure needed.

Dead Square Pruning

Cells from which a box can never reach any goal — dead squares — are precomputed once when the solver is initialised. A box pushed into a non-goal corner is irrecoverable. So is a box pushed to the end of a corridor that has no goal and is bounded by walls on both sides.

The result is another BitPlane:

pub struct Solver {
    walls: BitPlane,
    goals: BitPlane,
    dead: BitPlane,        // precomputed dead square mask
    goal_distances: Vec<Vec<u32>>,
    // ...
}

At each push generation, the solver prunes with a single call:

if self.dead.get(push.to_x, push.to_y) { continue; }

2×2 Deadlock Detection

Four cells forming a 2×2 block where every cell is either a wall or a box — and not every cell is a goal — is a freeze deadlock: none of those boxes can ever move. Detecting this requires checking eight positions (four per candidate corner) with no allocation. With bitplanes, that is eight get() calls on two already-loaded planes:

let all_blocked = (0..2u8).all(|i| (0..2u8).all(|j| {
    walls.get(cx + i, cy + j) || boxes.get(cx + i, cy + j)
}));

There is no intermediate data structure. The check reads directly from the planes that are already in scope.

The Wire Format Falls Out Naturally

Because bitplanes are already a packed binary representation, serialising a level for URL sharing requires almost no extra work. The binary format is:

[width: u8] [height: u8] [player_pos: u16] [wall_plane...] [box_plane...] [goal_plane...] [crc8: u8]

The three planes are concatenated as-is. For the 10×8 example level above, that is ceil(80/8) = 10 bytes per plane — 30 bytes of plane data, 4 bytes of header, 1 byte of CRC-8. The full level encodes to 35 bytes.

That 35-byte blob goes through base64url encoding and a version prefix:

v1-AQAKCAAh_-AYBgGAYBgH_wAAAAAACAAAAAAAAAAAAAIAAAAATA

Paste that into the editor at tilebasedworlds.com/play/sokoban-editor/ and it reconstructs the level — player, box, and goal positions intact — entirely client-side, with no server round-trip. The URL is the level.

Decoding is symmetric. fragment_to_level strips the version prefix, base64url-decodes the payload, reads the header to determine plane size, slices the three planes out of the byte array, and verifies the CRC. If the URL has been corrupted — truncated, character-swapped, partially rewritten — the CRC detects it before the level gets anywhere near the solver.

This is the part that vindicates the representation choice. A naive level format would require a separate serialisation step — convert the 2D array to binary, design a layout, handle the mismatch between the internal representation and the wire format. With bitplanes, the internal representation is the wire format. as_bytes() returns the backing slice directly, no transformation required.

What XSB Is Still Good For

This is not an argument to abandon XSB. It remains the right format for human authoring, level exchange, and display. The solver accepts XSB on input — SokobanLevel::from_xsb parses the string and fills the bitplanes in a single pass — and can emit it again for display. The round-trip is exact.

The point is that XSB should be treated as an interchange format, not as the working representation. The moment a level enters the solver, it should become bitplanes. The string stays at the edge; the bit-packed planes are what the solver thinks with.

The design decision that surprised me most was how far upstream the bitplane choice propagated. I expected to make a data structure decision and move on. Instead, choosing bitplanes determined the shape of the search state, made the hash implementation trivial, made reachability testing a single inline call, and gave me the URL encoding for free. The representation was not incidental to the design — it was the design.