Porting rd-132211 to Rust: First Blood

In which we translate 1,562 lines of 2009-era Java into modern Rust with a GPU rendering stack, learn that “up” is a matter of opinion, and discover that getting a player to not fall through the floor is harder than it sounds.

This is version 1 of 878. God help us.

What We’re Building

The RustCraft project aims to port every version of Minecraft Java Edition to Rust, from the earliest prototype through the latest release. Today we start at the beginning: rd-132211, dated May 13, 2009. The “RubyDung” tech demo. Thirteen Java files. A flat world made of blocks you can place and destroy. No crafting, no mobs, no survival — just the raw voxel sandbox that everything else would be built on top of.

The analysis post covered what this version is. This post covers the experience of bringing it back to life in a different language, on a completely different rendering stack, seventeen years later.

The Architecture Decision That Shapes Everything

Before writing any game logic, we had to make a foundational choice: how to structure the project for 877 future versions.

The split we landed on: lib.rs for all game logic, main.rs for the window, game loop, and rendering. The library crate exports modules for level, player, phys, and timer. The binary crate imports these and wraps them in a wgpu/winit application. Integration tests import the library directly and exercise the game logic without ever touching a GPU.

This sounds obvious in retrospect, but it has real consequences. Every public function in the game logic layer is designed to be called from tests or from the main game loop with identical behavior. The Player does not know it is being rendered. The Level does not know there is a window. The Timer does not know whether its ticks are driving a real game or a test harness.

The module layout:

src/
  lib.rs              # Module declarations only
  level.rs            # 256x256x64 voxel world, save/load, lighting
  player.rs           # Physics, movement, collision
  phys.rs             # AABB collision primitives
  timer.rs            # 60 TPS fixed timestep
  renderer/
    mod.rs            # wgpu setup, render loop, raycasting
    vertex.rs         # GPU vertex formats
    chunk_mesh.rs     # Mesh generation per chunk
    shader.wgsl       # WGSL vertex/fragment shaders

The renderer gets its own subdirectory because it is the part that will grow the most. In rd-132211, rendering is one block type with one texture. By 1.0, it will be a beast. Giving it room to expand now saves refactoring later.

Translating Fixed-Function OpenGL to wgpu

This was the hardest part of the port, and it is not close.

The Java version uses LWJGL’s OpenGL bindings in immediate mode. Rendering a block face looks like this (paraphrased):

glColor3f(brightness, brightness, brightness);
glTexCoord2f(u, v);
glVertex3f(x, y, z);

Three calls per vertex. Six vertices per face (two triangles). Six faces per block. The GPU driver figures out the rest. There is no vertex buffer. There is no shader. There is no pipeline object. You just shove vertices at the driver and it draws them.

wgpu does not do any of that. wgpu is a modern, explicit graphics API. You define a vertex format. You write a shader in WGSL. You create a pipeline that binds the shader to the vertex format. You allocate GPU buffers, fill them with vertex data, create bind groups for uniforms and textures, and issue draw calls against the pipeline. The setup code for a single textured, lit, fogged quad in wgpu is roughly 300 lines. In Java, it was 12.

Our vertex format packs position, texture coordinate, and brightness into 24 bytes:

#[repr(C)]  // guarantee field order/layout for GPU upload
#[derive(Copy, Clone, bytemuck::Pod, bytemuck::Zeroable)]
pub struct Vertex {
    pub position: [f32; 3],    // 12 bytes
    pub tex_coord: [f32; 2],   // 8 bytes
    pub brightness: f32,       // 4 bytes
}

The WGSL shader does what Java’s fixed-function pipeline did implicitly: transform vertices by a view-projection matrix, sample a texture, multiply by brightness, and blend with fog. Except now it is explicit. Every operation that the OpenGL driver used to handle behind your back — perspective division, fog attenuation, texture sampling mode — lives in 82 lines of shader code that we wrote and own.

let lit_color = tex_color.rgb * in.brightness;
let final_color = mix(uniforms.fog_color.rgb, lit_color, in.fog_factor);

This was painful upfront. But it is the right call for the project. As we port later versions, we will need custom shaders for water, for sky rendering, for transparency sorting, for particles. Starting with wgpu means we never hit a wall where the fixed-function pipeline cannot express what we need. We pay the cost once, at version 1, instead of paying it later when the codebase is larger.

The Axis Naming Catastrophe

The Java code defines the world as width x height x depth. You would expect width = X, height = Y, depth = Z. Reasonable. Intuitive.

Wrong.

In Notch’s coordinate system: width is X (256), height is Z (256), and depth is Y (64). The horizontal extent along the Z axis is called “height.” The vertical extent is called “depth.” The block index formula is (y * height + z) * width + x: the z coordinate is the one bounded by height, which is how the Z extent ended up with that name.

We preserved this naming in the Rust port for behavioral parity. Every time we write level.height and mean “the horizontal Z extent of the world,” a small part of the codebase dies inside. But matching the original semantics exactly is more important than our feelings about variable names. If a bug report says “block at (x=5, y=10, z=3) is wrong,” we need the indexing to match the Java version exactly, or debugging becomes archaeology on top of archaeology.
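The preserved indexing can be pinned down in a small sketch. The constant names are ours; the formula is the one carried over from the Java version:

```rust
// Dimensions as the Java original names them. Only DEPTH is vertical.
const WIDTH: usize = 256;  // X extent
const HEIGHT: usize = 256; // Z extent (yes, "height" is horizontal here)
const DEPTH: usize = 64;   // Y extent (the actual vertical axis)

// Block index formula, preserved from the Java version:
// (y * height + z) * width + x. The z coordinate must stay below HEIGHT.
fn block_index(x: usize, y: usize, z: usize) -> usize {
    debug_assert!(x < WIDTH && y < DEPTH && z < HEIGHT);
    (y * HEIGHT + z) * WIDTH + x
}
```

Keeping this function identical to the Java expression is what makes cross-version debugging tractable: a block coordinate reported against either codebase maps to the same array slot.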

This naming persists through multiple early versions of Minecraft before eventually being rationalized. We will fix it when Notch fixes it.

The Player.y Semantic Shift

This one caught us during testing and is worth documenting because it is the kind of subtle behavior difference that causes parity failures if you are not paranoid about reading the original code.

In the Java Player constructor, the y parameter is the center of the bounding box. The constructor creates an AABB centered at (x, y, z) with half-extents of 0.3 horizontally and 0.9 vertically:

this.bb = new AABB(x - 0.3, y - 0.9, z - 0.3, x + 0.3, y + 0.9, z + 0.3);

So far so good. But after the first call to tick(), the player’s y coordinate is recalculated:

this.y = this.bb.y0 + 1.62F;

That 1.62 is eye height. After one tick, player.y no longer represents the center of the bounding box. It represents the eye position — 1.62 units above the feet. The semantic meaning of the same field changes after the first physics tick.

We matched this exactly. Our Player::new() sets y to the center. Our apply_movement() method (called during tick()) sets y = bb.y0 + EYE_HEIGHT. The test test_player_eye_height verifies this post-tick invariant. But we had to notice the discrepancy first, which required reading the Java constructor and the Java move() method side by side and realizing they disagreed about what y meant.

AABB Collision: Order Kills You

The collision resolution in Player.move() resolves axes in this order: Y, then X, then Z. We got this wrong on the first attempt and the player fell through the floor.

The reason order matters: after resolving each axis, you move the bounding box before resolving the next. If you resolve X first, you might move horizontally into a position where the Y clip calculation gives a different answer. Gravity pulls the player down; if you have already slid horizontally into a block, the Y resolution might allow the downward movement because the player’s horizontal position no longer overlaps the block above.

The correct algorithm:

  1. Expand the player’s AABB by the full movement vector.
  2. Query the level for all solid AABBs that overlap this expanded region.
  3. Clip Y movement against every solid AABB. Move the player AABB by the clipped Y.
  4. Clip X movement against every solid AABB. Move the player AABB by the clipped X.
  5. Clip Z movement against every solid AABB. Move the player AABB by the clipped Z.

Step 3 must complete — including the bb.translate(0.0, ya, 0.0) — before step 4 starts. The bounding box must be in its Y-resolved position when X clipping begins, and in its Y+X-resolved position when Z clipping begins.
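The ordering can be made concrete with a self-contained sketch. The clip_*_collide and translate names mirror the Java AABB; the bodies are our reconstruction of the idea, not the verbatim port:

```rust
#[derive(Clone, Copy)]
struct Aabb {
    x0: f32, y0: f32, z0: f32,
    x1: f32, y1: f32, z1: f32,
}

impl Aabb {
    fn translate(&mut self, xa: f32, ya: f32, za: f32) {
        self.x0 += xa; self.x1 += xa;
        self.y0 += ya; self.y1 += ya;
        self.z0 += za; self.z1 += za;
    }

    // Shrink a proposed Y movement of `other` so it cannot pass through self.
    fn clip_y_collide(&self, other: &Aabb, mut ya: f32) -> f32 {
        // No overlap on the other two axes means no possible collision on Y.
        if other.x1 <= self.x0 || other.x0 >= self.x1 { return ya; }
        if other.z1 <= self.z0 || other.z0 >= self.z1 { return ya; }
        if ya > 0.0 && other.y1 <= self.y0 { ya = ya.min(self.y0 - other.y1); }
        if ya < 0.0 && other.y0 >= self.y1 { ya = ya.max(self.y1 - other.y0); }
        ya
    }

    fn clip_x_collide(&self, other: &Aabb, mut xa: f32) -> f32 {
        if other.y1 <= self.y0 || other.y0 >= self.y1 { return xa; }
        if other.z1 <= self.z0 || other.z0 >= self.z1 { return xa; }
        if xa > 0.0 && other.x1 <= self.x0 { xa = xa.min(self.x0 - other.x1); }
        if xa < 0.0 && other.x0 >= self.x1 { xa = xa.max(self.x1 - other.x0); }
        xa
    }

    fn clip_z_collide(&self, other: &Aabb, mut za: f32) -> f32 {
        if other.x1 <= self.x0 || other.x0 >= self.x1 { return za; }
        if other.y1 <= self.y0 || other.y0 >= self.y1 { return za; }
        if za > 0.0 && other.z1 <= self.z0 { za = za.min(self.z0 - other.z1); }
        if za < 0.0 && other.z0 >= self.z1 { za = za.max(self.z1 - other.z0); }
        za
    }
}

// Resolve in the order that matters: Y, then X, then Z, translating the
// box after each axis so later axes see the updated position.
fn resolve_move(bb: &mut Aabb, cubes: &[Aabb],
                mut xa: f32, mut ya: f32, mut za: f32) -> (f32, f32, f32) {
    for c in cubes { ya = c.clip_y_collide(bb, ya); }
    bb.translate(0.0, ya, 0.0);
    for c in cubes { xa = c.clip_x_collide(bb, xa); }
    bb.translate(xa, 0.0, 0.0);
    for c in cubes { za = c.clip_z_collide(bb, za); }
    bb.translate(0.0, 0.0, za);
    (xa, ya, za)
}
```

Drop a box from above a unit floor cube with ya = -5.0 and the Y clip truncates it to -1.0, leaving the feet resting exactly on the floor's top face.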

This is obvious once you understand it. It was not obvious from reading the code for the first time.

Raycasting Instead of GL Selection

The Java version picks blocks by switching OpenGL into selection mode, rendering the scene with name-stack metadata, and reading back hit records. This was deprecated in OpenGL 3.0 (2008). It was already the wrong approach when Notch used it in 2009.

We replaced it with a simple step-along-the-ray cast: start at the player’s eye position, step forward in 0.01-unit increments along the view direction, and check if each step lands inside a solid block. When it does, we look at the previous step’s block position to determine which face was hit.

let step = 0.01_f32;
let steps = (reach / step) as i32; // reach is 5.0 blocks
let mut prev_block: Option<(i32, i32, i32)> = None;

for i in 0..steps {
    let t = i as f32 * step;
    let pos = origin + dir * t;
    let bx = pos.x.floor() as i32;
    let by = pos.y.floor() as i32;
    let bz = pos.z.floor() as i32;

    if level.is_solid_tile(bx, by, bz) {
        // The last empty block stepped through tells us which face we hit.
        // face_from is a small helper (not shown) that compares the two
        // block positions and returns the touching face.
        let face = face_from(prev_block?, (bx, by, bz));
        return Some((bx, by, bz, face));
    }
    prev_block = Some((bx, by, bz));
}
None

This is not the most efficient approach (a proper DDA/Bresenham voxel traversal would be faster), but it is correct and simple. For a reach distance of 5 blocks with a step size of 0.01, that is 500 iterations per frame — trivial on modern hardware. We can optimize later if profiling shows it matters, but for rd-132211 it does not.

The Two-Layer Rendering Trick

The Java version renders the world in two passes. Layer 0 renders faces with full brightness (1.0). Layer 1 renders faces with shadow brightness (0.8). The selection logic uses a clever XOR: if (br == c1 ^ layer == 1). This means each face is rendered exactly once, in the layer that matches its brightness.
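The XOR reads awkwardly in Java. Restated in Rust terms (our restatement, not code from the port), it is a per-face layer filter:

```rust
// A face is emitted exactly once: bright faces (brightness == 1.0) go in
// layer 0, shadowed faces in layer 1. This restates Java's
// `if (br == c1 ^ layer == 1)` selection.
fn emit_in_layer(brightness: f32, layer: u32) -> bool {
    let c1 = 1.0_f32;
    (brightness == c1) ^ (layer == 1)
}
```

Truth-table it and the "exactly once" property falls out: a face passes the filter for precisely one of the two layer values.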

Why two passes? In the Java code, the two layers allow different OpenGL state: different fog density or blending. It is infrastructure for a lighting model that this version barely uses.

We simplified this. Our wgpu renderer bakes the brightness directly into the vertex data as a float. The fragment shader multiplies the texture color by the brightness value. One pass, no layer logic, same visual result. The simplification is safe because the two-layer system in rd-132211 does not actually use different fog or blend settings between layers — it is scaffolding for complexity that arrives in later versions. When we need it, we will add it back.

Chunk Meshing and Dirty Tracking

The world is divided into 16x16x16 chunks, each with its own vertex buffer. When a block changes, the chunks that contain or border that block are marked dirty. On each frame, we rebuild at most 2 dirty chunks — matching the Java version’s MAX_REBUILDS_PER_FRAME constant. This prevents frame drops when the player destroys or places many blocks quickly.
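The per-frame rebuild budget amounts to a few lines. The struct and function names here are ours; the constant name matches the Java original:

```rust
const MAX_REBUILDS_PER_FRAME: usize = 2;

struct Chunk { dirty: bool /* plus mesh data in the real renderer */ }

// Rebuild at most MAX_REBUILDS_PER_FRAME dirty chunks and report how many
// were handled; any remaining dirty chunks wait for later frames.
fn rebuild_some(chunks: &mut [Chunk]) -> usize {
    let mut rebuilt = 0;
    for chunk in chunks.iter_mut() {
        if rebuilt == MAX_REBUILDS_PER_FRAME { break; }
        if chunk.dirty {
            // regenerate this chunk's vertex buffer here (omitted)
            chunk.dirty = false;
            rebuilt += 1;
        }
    }
    rebuilt
}
```

With five dirty chunks, the budget drains them over three frames: 2, 2, then 1.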

For a 256x256x64 world with 16x16x16 chunks, that is 16 * 16 * 4 = 1,024 chunks. Initial load rebuilds all of them, which takes 512 frames (about 8.5 seconds at 60 FPS). The Java version has the same startup cost. We could parallelize chunk building on background threads, but matching the original behavior is more important than optimizing it right now.

Frustum culling skips chunks that are entirely outside the camera’s view. We extract six clip planes from the view-projection matrix and test each chunk’s AABB against all six planes. If all eight corners of a chunk are behind any single plane, the chunk is culled. This is the same algorithm the Java version uses in Frustum.java, adapted from the widely-circulated OpenGL frustum culling tutorial that Notch almost certainly copied from.
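The corner test needs no math library at all. In this sketch a plane is (a, b, c, d) with "inside" meaning ax + by + cz + d >= 0; extracting the six planes from the view-projection matrix is omitted:

```rust
// True if all eight corners of the box are behind plane p.
fn fully_behind(min: [f32; 3], max: [f32; 3], p: [f32; 4]) -> bool {
    for &x in &[min[0], max[0]] {
        for &y in &[min[1], max[1]] {
            for &z in &[min[2], max[2]] {
                if p[0] * x + p[1] * y + p[2] * z + p[3] >= 0.0 {
                    return false; // at least one corner is inside this plane
                }
            }
        }
    }
    true
}

// A chunk is culled if any single frustum plane has the whole box behind it.
fn chunk_visible(min: [f32; 3], max: [f32; 3], planes: &[[f32; 4]; 6]) -> bool {
    !planes.iter().any(|&p| fully_behind(min, max, p))
}
```

Note the asymmetry: a box straddling a plane is not culled, so this test can keep some offscreen chunks (it is conservative), but it never discards a visible one.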

The Arc Ownership Dance

One Rust-specific challenge: wgpu’s Surface requires that the Window outlive it. In practice, this means the window must be wrapped in Arc<Window> so the surface and the game struct can both hold references. Java does not have this problem because the garbage collector manages lifetimes. In Rust, you have to think about it.

The Game struct holds Arc<Window>. The renderer holds a Surface<'static> created from a clone of that Arc. The game loop holds the EventLoop on the main thread. This is standard wgpu boilerplate, but it is one of those patterns where Rust’s ownership model forces you to be explicit about something that Java lets you ignore.

We also had to handle the winit ApplicationHandler trait, which requires the window to be created inside the resumed() callback rather than at startup. This is because on some platforms (notably mobile), the window surface is not available until the event loop is running. Our App struct wraps Option<Game> and creates the game lazily on first resume.

Test Suite

We ended up with 34 behavioral parity tests covering every aspect of the game logic:

  • Level: dimensions, block fill pattern, index formula, bounds checking, get/set tile, lighting heightmap, brightness values, light depth updates on block changes, cube queries for collision, save/load round-trip
  • AABB: expand, grow, intersect, clip (X, Y, Z), translate, no-overlap passthrough
  • Player: bounding box dimensions, eye height post-tick, gravity application, gravity velocity constant, jump mechanics, ground detection after settling, mouse sensitivity, pitch clamping, movement speed on ground
  • Block interaction: placement on faces, removal, face offset mapping
  • Timer: 60 TPS tick count, max tick cap at 100, partial tick interpolation
  • Integration: player falls from height and lands at correct surface, player collides with a wall and is stopped
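The Timer behaviors in that list reduce to a small accumulator. Field and method names here are our guesses at a minimal shape; the 60 TPS rate and the 100-tick cap are the original's constants:

```rust
const TICKS_PER_SECOND: f32 = 60.0;
const MAX_TICKS: u32 = 100; // cap on ticks consumed in a single advance

struct Timer {
    // Accumulated, not-yet-consumed tick time. The fractional remainder
    // left after an advance is the "partial tick" used for interpolation.
    accumulated: f32,
}

impl Timer {
    // Convert elapsed wall time into whole ticks, capped at MAX_TICKS.
    fn advance(&mut self, elapsed_secs: f32) -> u32 {
        self.accumulated += elapsed_secs * TICKS_PER_SECOND;
        let ticks = (self.accumulated as u32).min(MAX_TICKS);
        self.accumulated -= ticks as f32;
        ticks
    }
}
```

The cap is what keeps a laptop waking from sleep from running thousands of catch-up ticks in one frame.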

Every test creates a Level::with_save_path pointing at /dev/null to avoid filesystem side effects (except the save/load test, which uses tempfile). Every test runs in milliseconds. cargo test completes in under a second.

Dependencies

We kept the dependency list minimal:

  • wgpu: GPU rendering (Vulkan/Metal/DX12)
  • winit: windowing and input
  • image: PNG texture loading
  • glam: math (Vec3, Mat4, projection matrices)
  • bytemuck: zero-copy vertex data casting
  • flate2: gzip compression for level.dat
  • pollster: blocking on wgpu async initialization
  • env_logger: debug logging
  • tempfile: temp directories for save/load tests (dev only)

No game engine. No ECS framework. No physics library. The game logic is small enough that dependencies would add more complexity than they remove. This will change as versions get more complex, but for now, the crate doing the most work is wgpu, and even that is mostly boilerplate.

What We Skipped

Networking and sound — neither existed in rd-132211, so there is nothing to skip. The Java version is a single-player, silent experience. You open a window, you look at blocks, you place and destroy them, and there is no audio feedback whatsoever.

We also skipped the LevelListener trait as an active callback mechanism. The Java version uses it to notify the renderer when chunks need rebuilding. Our renderer exposes set_dirty() directly, and the game loop calls it after block changes. The trait definition exists in level.rs for documentation purposes, but nothing implements it yet. When the level starts generating events (terrain generation callbacks, entity notifications), we will wire it up.

What Is Next

Version 2 is mc-161648 (May 14, 2009 — the very next day). Looking at the manifest, the gap between many early versions is measured in hours. Notch was iterating fast.

We have 877 versions to go. The infrastructure is in place: the build system, the test framework, the rendering pipeline, the blog format. Each subsequent version will be a diff against the previous one — new features, changed constants, added block types, new physics behaviors. The Rust codebase will grow incrementally, matching the Java codebase’s evolution step by step.

The first version is always the hardest. Not because the code is complex — rd-132211 is tiny — but because everything is a decision. What rendering API? What module layout? What testing strategy? What naming conventions? Every choice constrains the 877 versions that follow. We made those choices. Now we live with them.

The flat stone world of RubyDung renders on screen. The player walks, falls, jumps. Blocks appear and disappear under the cursor. The sun never sets (there is no sun). The fog fades to pale blue at the horizon. It looks like nothing and everything at once — the ghost of a game that 300 million people would eventually play, running in a language that did not exist when it was first written.

Let’s keep going.

By Clara
