All-Purpose Mat's Blog


As a game developer, some of my most creative work has come from embracing limitations rather than fighting against them. As counterintuitive as it sounds, clamping down on hardware capabilities or abstractions forces you to think outside the box much more.

To give you this experience, there are online fantasy consoles such as PICO-8 (nonfree) and TIC-80 which make it super accessible to prototype and finish small experiences. There's also hardware like the Playdate (nonfree) that further plays with input methods and form factors to really constrain your playground. Finally, there are the thriving homebrew communities around consoles such as the SNES and the N64 (check out this awesome demake of Portal!).

I've personally always had a soft spot for the Wii. Partially because I grew up with its incredible games such as Super Mario Galaxy 2 but also because Wii game modding gave me a peek at what would later be my career: game development. Although I've dabbled with Wii development in the past, I never felt I really understood what I was doing. A couple months ago, I set out to fix this. Armed with my finished DirectX assignment for a university Graphics Programming course, and the open door of “you can add extra features to raise your grades, but those are not mandatory,” I thought of this: what if I show up to the exams with my Wii, and do the presentation on it?

A picture of a messy table with a Wiimote and GameCube controller, with a CRT hooked up to an offscreen GameCube showing a vehicle with a fire effect behind it

DirectX on the Wii (jk)

As excited as I was to enact this idea, I knew that I wasn't just going to compile my DirectX shaders and code for the Wii's CPU and call it a day. DirectX is, uh, not very portable or compatible with the Wii. The Wii is equipped with a GPU codenamed “Hollywood,” which has a whopping 24MB of video RAM and no hardware support for any sort of shader. It really makes you appreciate some of the amazing scenes crafted on this console.

A shot of the starting area in Slimy Spring Galaxy > Click here to explore Slimy Spring Galaxy on noclip.website

So, we must speak Hollywood's own API (called GX) to coax it into rendering a mesh with textures and transparency (as required by the assignment).

NOTE: In the final project, I've created a GX folder to hold all GX-specific code, and isolated the DirectX stuff into a separate folder called SDL. This way, I can control which platform-specific code is used via a simple CMake option. If you'd like to follow along, you can find everything here.

libogc

To access this API from C++, there's a library maintained by the folks at @devkitPro@mastodon.gamedev.place called libogc. This library, combined with the PowerPC toolchain, allows one to build programs targeting the Wii (and the GameCube, since they're so similar!).

NOTE: whenever I refer to the Wii from now on, it (mostly) also applies to the GameCube.

Although devkitPro themselves do not have a CMake toolchain file available, I was able to find an MIT licensed one courtesy of the rehover homebrew game. Passing this toolchain file to CMake automatically sets it up to build for the Wii. Cool stuff!

NOTE: The rest of this post is accurate to the best of my understanding, but it is likely I got some things wrong! If you need accurate info, I suggest you take a look at libogc's gx.h with all the functions and comments as well as the official devkitPro GX examples rather than following my own code. Comments, questions, and corrections are as always welcome at my fedi handle @mat@mastodon.gamedev.place !

Video setup

I won't dwell on the init too long, as most of it is just taken from a libogc example and it's not too thrilling. What is cool though is how, to do v-sync, we create two “framebuffers” which are merely integer arrays... on the CPU? This is where one of the big differences in the Wii's hardware design compared to a modern computer comes in: both the CPU and GPU have access to 24MB of shared RAM. Meanwhile, on a modern PC, a discrete GPU has its own dedicated RAM which the CPU cannot touch directly.

This shared RAM is where we store these framebuffer arrays, named by the Wii hacking scene “eXternal Frame Buffers” or XFBs (source). Because having the GPU work on the XFB directly would be slow, the GPU has its own bit of actually private RAM which stores what's officially called the “Embedded Frame Buffer” (EFB). GX draw commands work on the super-fast EFB, and when our frame is ready we can copy the EFB into the XFB, for the Video Interface to read and finally display to the screen. This buffer copy is loosely equivalent to “presenting” the frame as is done in the APIs we're used to.

                 ┌─────────┐                           
                 │         │                           
       ┌─────────┤   CPU   ├───────────────────┐           
       │         │         │                   │           
       │         └─────────┘                   ▼           
       │                                  GX drawcalls      
       │                                       │           
       ▼                          ┌────────────▼──────────┐
Create XFB arrays                 │                       │
       │                          │ GPU private RAM (EFB) │
       │                          │                       │
       │                          └────────────┬──────────┘
       │                                       ▼           
       │                           Copy EFB to current XFB 
       │                                       │           
       │      ┌───────────────────────┐        │           
       │      │                       │        │           
       └──────► Shared MEM1 RAM (24MB) ◄───────┘           
              │                       │                    
              └─────────┬─────────────┘                    
                        ▼                              
                  Display frame                        
               ┌────────▼────────┐                     
               │                 │                     
               │ Video Interface │                     
               │                 │                     
               └─────────────────┘                     

The following code specifically handles finishing the frame and displaying it:

void GraphicsContext::Swap()
{
    GX_DrawDone();

    GX_SetZMode(GX_TRUE, GX_LEQUAL, GX_TRUE);
    GX_SetColorUpdate(GX_TRUE);
    GX_CopyDisp(g_Xfb[g_WhichFB], GX_TRUE);

    VIDEO_SetNextFramebuffer(g_Xfb[g_WhichFB]);
    VIDEO_Flush();
    VIDEO_WaitVSync();
    
    g_WhichFB ^= 1; // flip framebuffer
}

Every time a frame is done, we tell the GPU we're done via GX_DrawDone() and then do our EFB to XFB copy via GX_CopyDisp(g_Xfb[g_WhichFB], GX_TRUE) (where g_Xfb holds our two XFBs and g_WhichFB is a single bit we flip every frame). Then we notify the Video Interface of the framebuffer it should display with a call to VIDEO_SetNextFramebuffer(g_Xfb[g_WhichFB]). Finally, VIDEO_Flush() and VIDEO_WaitVSync() ensure we don't start rendering the next frame before this one is displayed.
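To make the flip logic a bit more concrete, here's a tiny host-side sketch of the same double-buffering pattern with no libogc involved. FakeSwapChain, present, and the pixel count are names I made up for illustration. The real code hands the buffers to GX_CopyDisp and VIDEO_SetNextFramebuffer instead of copying vectors around.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two XFBs; nothing here is part of libogc.
struct FakeSwapChain
{
    std::array<std::vector<uint32_t>, 2> xfb;         // the two "external" framebuffers
    int whichFB = 0;                                   // buffer the next frame gets copied into
    const std::vector<uint32_t>* displayed = nullptr;  // what the pretend Video Interface shows

    explicit FakeSwapChain(std::size_t pixelCount)
    {
        xfb[0].assign(pixelCount, 0);
        xfb[1].assign(pixelCount, 0);
    }

    // Mirrors Swap(): copy the finished frame (our pretend EFB) into the
    // current XFB, point the display at it, then flip to the other buffer.
    void present(const std::vector<uint32_t>& efb)
    {
        assert(efb.size() == xfb[whichFB].size());
        xfb[whichFB] = efb;        // stands in for GX_CopyDisp
        displayed = &xfb[whichFB]; // stands in for VIDEO_SetNextFramebuffer
        whichFB ^= 1;              // flip framebuffer
    }
};
```

The point of having two buffers is that the one being displayed is never the one we're about to overwrite.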

Drawing a mesh

Now that we know how framebuffers work on the Wii, let's get to drawing our mesh!

Vertex attribute setup

Before we can start pushing triangles via GX, we must tell it what kind of data to expect. This is done in two steps:

  1. First, we tell GX that we will be giving it our vertex data directly every frame, rather than having it fetch it from an array via indices.
    GX_SetVtxDesc(GX_VA_POS, GX_DIRECT);
  2. We then access the vertex format table at index 0 (GX_VTXFMT0), and set its position attribute (GX_VA_POS) as follows:
    • The data consists of three values for XYZ coords (GX_POS_XYZ)
    • Each value is a 32-bit floating point number (GX_F32)
    • The last argument is the number of fractional bits, which as far as I can tell only matters for fixed-point formats. With floats, zero worked fine for me.

GX_SetVtxAttrFmt(GX_VTXFMT0, GX_VA_POS, GX_POS_XYZ, GX_F32, 0);

Both of those functions are then repeated for normals and texture data, if needed.
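As far as I can tell from gx.h, the last argument of GX_SetVtxAttrFmt is the number of fractional bits used when a component is stored as a fixed-point integer (for example GX_S16) instead of a float. Here's a rough host-side sketch of how I understand such a value gets interpreted. decodeFixed is a hypothetical helper of mine, not a libogc function.

```cpp
#include <cassert>
#include <cstdint>

// Interprets a 16-bit fixed-point component with `frac` fractional bits.
// Hypothetical helper for illustration only; the GPU does this in hardware.
float decodeFixed(int16_t raw, unsigned frac)
{
    assert(frac < 16);
    return static_cast<float>(raw) / static_cast<float>(1u << frac);
}
```

With 8 fractional bits, a raw value of 256 decodes to 1.0 and -128 decodes to -0.5. With GX_F32 there is nothing to decode, which would explain why zero works.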

NOTE: The Wii's GPU supports indexed drawing, where vertex data is stored in an array and drawn using indices into that array. This allows fewer vertices to be defined while reusing them.
I didn't know about this until I finished this project, so we'll be sticking with non-indexed drawing. The concept is quite similar, but you'd set the vertex desc to GX_INDEX8 and bind an array before calling GX_Begin. You'd then pass indices rather than vertex data inside the begin/end block.

Drawcalls

Each frame, we must queue up some commands in the GPU's first-in-first-out buffer. We can tell GX it's time to draw some primitives via the GX_Begin function, passing along the type of primitives (triangles!), the index in the vertex format table we filled in earlier, and the number of vertices we'll be drawing. Afterward, we can give it the data in order by calling the respective function for each attribute we configured. Finally, we cap it off with a GX_End (which libogc just defines as an empty function, so I guess it may just be syntax/API sugar).

// one vertex is submitted per index, so the count must be the number of indices
GX_Begin(GX_TRIANGLES, GX_VTXFMT0, static_cast<u16>(m_Indices.size()));
for(uint32_t index : m_Indices)
{
    // NOTE: really wish I used GX's indexing support...
    const Vertex& vert = m_Vertices[index]; 

    GX_Position3f32(vert.pos.x, vert.pos.y, vert.pos.z);
    GX_Normal3f32(vert.normal.x, vert.normal.y, vert.normal.z);
    GX_TexCoord2f32(vert.uv.x, vert.uv.y);
}
GX_End();
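Since the loop walks the index list and submits one full vertex per index, the stream GX receives has as many vertices as there are indices, not as many as there are unique vertices. Here's a small host-side sketch of that expansion. Vertex is trimmed down to a position and expandIndexed is a hypothetical helper, not something from my project or libogc.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vertex { float x, y, z; };

// Expands an indexed mesh into the flat vertex stream that non-indexed
// drawing sends between GX_Begin and GX_End.
std::vector<Vertex> expandIndexed(const std::vector<Vertex>& vertices,
                                  const std::vector<uint32_t>& indices)
{
    std::vector<Vertex> stream;
    stream.reserve(indices.size());
    for (uint32_t index : indices)
    {
        assert(index < vertices.size());
        stream.push_back(vertices[index]);
    }
    return stream;
}
```

A quad made of four unique vertices and six indices expands to six submitted vertices, and six is the count GX_Begin should be given.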

Transformations

NOTE: This section will assume you're familiar with matrix transformations. If you don't know what this is, here's a link to the first of two pages in the OpenGL tutorial discussing this, which is the explanation that finally made it click for me.

The first important matrix is the model matrix. This matrix's job is to convert model-space vertices into world-space. This is useful when we want to rotate, scale, or translate an object in the world.

To look around in our scene, we need to set up a view matrix, which takes care of translating world-space into view-space. Finally, we'll need a projection matrix that turns the given view-space into clip-space, at which point the GPU takes over and handles stuff like culling and converting to non-homogeneous coordinates.

In normal graphics, we tend to pair view and projection together, and leave model on its own for transforming normals and other data to world space in the shader. The Wii however takes a different approach: load the combined modelView matrix, and separately handle the projection.

The reason for this is quite interesting: GX expects you to give it light information in view space rather than the usual world space. We'll cover simple lighting in a later section.

So, we must only set these two matrices to handle all of our transformation needs:

GX_LoadProjectionMtx(projectionMat, GX_PERSPECTIVE);

// use the same matrix for positions and normals
// (fine as long as the model isn't scaled non-uniformly)
GX_LoadPosMtxImm(modelViewMat, GX_PNMTX0);
GX_LoadNrmMtxImm(modelViewMat, GX_PNMTX0);
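For completeness, here's a host-side sketch of how the combined modelView matrix comes together. libogc's Mtx type is 3 rows by 4 columns and has a helper for this (guMtxConcat), but the code below is my own stand-in so it runs anywhere. Mtx34, translation, concat, and makeModelView are all names I made up.

```cpp
#include <array>
#include <cassert>

// 3 rows x 4 columns, the same shape as libogc's Mtx.
// The missing fourth row is implied to be (0, 0, 0, 1).
using Mtx34 = std::array<std::array<float, 4>, 3>;

Mtx34 translation(float x, float y, float z)
{
    Mtx34 m{};
    m[0] = {1.0f, 0.0f, 0.0f, x};
    m[1] = {0.0f, 1.0f, 0.0f, y};
    m[2] = {0.0f, 0.0f, 1.0f, z};
    return m;
}

// Returns a * b, treating both as affine transforms.
Mtx34 concat(const Mtx34& a, const Mtx34& b)
{
    Mtx34 out{};
    for (int r = 0; r < 3; ++r)
    {
        for (int c = 0; c < 4; ++c)
        {
            out[r][c] = a[r][0] * b[0][c] + a[r][1] * b[1][c] + a[r][2] * b[2][c];
        }
        out[r][3] += a[r][3]; // a's translation, picked up from b's implied last row
    }
    return out;
}

// The model matrix is applied to a vertex first, then the view matrix.
Mtx34 makeModelView(const Mtx34& view, const Mtx34& model)
{
    return concat(view, model);
}

// Usage: a model moved to (1, 2, 3), seen by a camera that pushes the world 5 units back.
void modelViewExample()
{
    const Mtx34 mv = makeModelView(translation(0.0f, 0.0f, -5.0f), translation(1.0f, 2.0f, 3.0f));
    assert(mv[2][3] == -2.0f); // 3 - 5
}
```

The order matters: swapping the arguments would apply the camera transform before the model transform, which only happens to give the same result for pure translations like the ones in this example.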

A lone untextured cube on a blue background

Textures

Textures are actually really easy! We can directly bind a byte array as a texture, since the CPU and GPU have that 24MB of shared RAM.

I initially tried to use the Wii's native format (TPL), which has some really cool features such as the CMPR compressed texture encoding, which has the GPU decompress the texture live when it needs the data, at (seemingly) no performance cost. Awesome!

Sadly, I couldn't get it working...

The vehicle, with a rainbow corrupted-looking texture

Even using basic TPL, there were some gnarly artifacts:

A close-up of a wing from the vehicle, with bizarre texture artifacts

I finally caved and decided to just use PNG and decode it to a raw RGBA8 byte array, bypassing TPL entirely. This got rid of the artifacts, so I guess we'll never know why they happened!

GX_InitTexObj(&m_Texture, decodedData, width, height, GX_TF_RGBA8, GX_CLAMP, GX_CLAMP, GX_FALSE);

To use the texture, we can simply bind the texture object that we got during init to the index we want to sample from:

GX_LoadTexObj(const_cast<GXTexObj*>(&m_Texture), GX_TEXMAP0);

By default, GX reads from GX_TEXMAP0 when it draws triangles, so this is actually all we needed to do!

The vehicle, with textures

Transparent textures

We can set up blending with the alpha channel like so:

GX_SetBlendMode(GX_BM_BLEND, GX_BL_SRCALPHA, GX_BL_INVSRCALPHA, GX_LO_OR);

This tells GX that, when blending a new pixel over what's already in the framebuffer, it should scale the new pixel's color by its own alpha (GX_BL_SRCALPHA) and the existing pixel's color by one minus that alpha (GX_BL_INVSRCALPHA). As far as I can tell from gx.h, the final GX_LO_OR is a logic operation that only applies in GX_BM_LOGIC mode, so it's ignored here. There's a good explanation of this exact blend function over at LearnOpenGL.
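Written out as code, the standard source-alpha blend looks something like the sketch below. This is the textbook equation on floats in the 0 to 1 range, not a copy of what the hardware does bit for bit. Color and blendSrcAlpha are names I picked for illustration.

```cpp
#include <cassert>

struct Color { float r, g, b; };

// out = src * srcAlpha + dst * (1 - srcAlpha), per channel.
// src is the new pixel, dst is what's already in the framebuffer.
Color blendSrcAlpha(Color src, float srcAlpha, Color dst)
{
    assert(srcAlpha >= 0.0f && srcAlpha <= 1.0f);
    const float inv = 1.0f - srcAlpha;
    return {
        src.r * srcAlpha + dst.r * inv,
        src.g * srcAlpha + dst.g * inv,
        src.b * srcAlpha + dst.b * inv,
    };
}
```

A fully opaque pixel (alpha 1) replaces what was there, a fully transparent one (alpha 0) leaves it untouched, and everything in between is a weighted mix.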

Although transparency seems to work at first glance, there's a pretty big issue that appears if you look at the fire effect from close up! (I don't have a screenshot from the Wii build, so this one's from DirectX, but the same effect is visible.)

A close-up of the fire effect, where some planes are writing to the Z-buffer and causing fire that should be drawn behind it to get skipped instead

One of the triangles that makes up the effect is getting drawn before another triangle that should render behind it... the first one writes to the Z-buffer, causing the second triangle to get discarded. This is usually good, because it skips drawing pixels that are fully occluded, and makes sure stuff that's behind a model doesn't end up getting drawn over it. In the case of translucent images however, we get artifacts like the one above.

This image was rendered with the Z buffer entirely disabled, which shows why we need it:

The vehicle, but with bad Z sorting

The solution is thankfully quite simple:

if(m_UseZBuffer)
{
    GX_SetZMode(GX_TRUE, GX_LEQUAL, GX_TRUE);
}
else
{
    GX_SetZMode(GX_TRUE, GX_LEQUAL, GX_FALSE);
}

Set m_UseZBuffer to false for models using transparent textures, and that last GX_FALSE in the GX_SetZMode disables writing to the Z buffer. Note that we still want reading (the first GX_TRUE), as otherwise the fire effect would end up rendering over our vehicle mesh!
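To convince myself why this works, it helps to simulate a single pixel. The sketch below is a toy model of a "less than or equal" depth test with an optional depth write. DepthPixel and drawFragment are hypothetical names, and depth runs from 0 (near) to 1 (far).

```cpp
#include <cassert>

struct DepthPixel
{
    float depth = 1.0f; // starts at the far plane
    int drawCount = 0;  // how many fragments actually landed on this pixel
};

// Same idea as GX_SetZMode(GX_TRUE, GX_LEQUAL, writeDepth):
// always test against the Z-buffer, but only update it if writeDepth is set.
bool drawFragment(DepthPixel& pixel, float fragmentDepth, bool writeDepth)
{
    assert(fragmentDepth >= 0.0f && fragmentDepth <= 1.0f);
    if (!(fragmentDepth <= pixel.depth))
    {
        return false; // occluded, fragment is discarded
    }
    ++pixel.drawCount;
    if (writeDepth)
    {
        pixel.depth = fragmentDepth;
    }
    return true;
}
```

Draw the opaque vehicle at depth 0.5 with writes on, then a fire plane at 0.3 and another at 0.4. With writes on for the fire, the 0.4 plane is discarded because 0.3 is already in the buffer. With writes off, both fire planes pass against the vehicle's 0.5, while anything behind the vehicle still gets rejected.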

"""Shaders"""

Unlike modern APIs, the Wii's GPU is not programmable with arbitrary shaders. Instead, we can play with something quite powerful called texture environment (TEV) stages. We've got a whopping 16 TEV stages to play with, which Nintendo graciously calls a “flexible fixed-pipeline.” Each stage is essentially a configurable linear interpolation (lerp) between two values A and B by a factor of C. Finally, a fourth value D is added to the result.

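#include <algorithm> // std::min
#include <gctypes.h> // u8
// Rough approximation: values are 8-bit, so c = 255 stands for a lerp factor of 1.0
u8 TEV_stage(u8 a, u8 b, u8 c, u8 d)
{
    int lerp = (a * (255 - c) + b * c) / 255;
    return static_cast<u8>(std::min(d + lerp, 255)); // clamp rather than wrap around
}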

NOTE: There's also optional negation, scale, bias, and clamping. I'm skipping over them here because I didn't end up using them. There's more complete documentation available here.

The source of A, B, C, and D can all be configured per stage. You could, for example, have it lerp between your texture's color and the light color based on the amount of specular lighting it receives. I tried to set this up with lots of help from Jasper (thanks again!) but ultimately it didn't work. I'd like to try again sometime in the future!
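Here's a host-side sketch of how chaining stages could look for a single color channel, using floats in the 0 to 1 range to keep the math readable. This is not my (broken) specular attempt, and not real GX setup code. tevStage and shadePixel are made-up names, and the stage configuration is just one plausible example.

```cpp
#include <algorithm>
#include <cassert>

// One TEV stage on a single colour channel, values normalised to 0..1.
float tevStage(float a, float b, float c, float d)
{
    assert(c >= 0.0f && c <= 1.0f);
    const float result = d + (a * (1.0f - c) + b * c);
    return std::max(0.0f, std::min(result, 1.0f));
}

// Two chained stages (hypothetical configuration):
//   stage 0 multiplies the texture by the lighting result,
//   stage 1 fades that towards the light colour by a specular factor.
float shadePixel(float texture, float lighting, float lightColor, float specular)
{
    // With a = 0 and d = 0 the lerp collapses to b * c, a plain multiply.
    const float lit = tevStage(0.0f, texture, lighting, 0.0f);
    // The previous stage's output feeds in as a.
    return tevStage(lit, lightColor, specular, 0.0f);
}
```

With no specular (factor 0) the pixel is just texture times lighting, and with full specular (factor 1) it becomes the light color.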

Diffuse lighting

The Wii's GPU features built-in per-vertex lighting. This means that you can (optionally) tell it to calculate how much light each vertex receives from up to eight light sources, which can be either distance-attenuated (like a lamp) or angle-attenuated (like a spotlight).

GX provides a type GXLightObj that we can load and then set up with all our parameters. For the renderer I was making, I needed to set up a “sun” light, which is a very far away point light with (practically) no attenuation.

NOTE: normally in graphics programming, this is done with a simple directional light. However, the way I got it to work on the Wii was by simulating this attenuation-free point light model, so I went with that.

This is the bit of code that initializes it every frame:

GX_SetChanAmbColor(GX_COLOR0, ambientColor);
GX_SetChanMatColor(GX_COLOR0, materialColor);

GX_SetChanCtrl(
        GX_COLOR0, GX_ENABLE,
        GX_SRC_REG, GX_SRC_REG,
        GX_LIGHT0, GX_DF_CLAMP, GX_AF_NONE);

guVector lightPos = { -lightDirWorld.x * 100.f, -lightDirWorld.y * 100.f, -lightDirWorld.z * 100.f };
guVecMultiply(viewMatNoTrans, &lightPos, &lightPos);

GXLightObj lightObj;
GX_InitLightPos(&lightObj, lightPos.x, lightPos.y, lightPos.z);
GX_InitLightColor(&lightObj, lightColor);

GX_LoadLightObj(&lightObj, GX_LIGHT0);

Let's go over each step.

Color registers

First, we tell GX what ambient and material colors we'll use. The ambient color is used for lighting all vertices, regardless of how much light they receive. This makes sure the back of our mesh is not just pure black. The material color will tint your whole model (it's like a global vertex color), so I keep it as white.

Channel setup

GX_SetChanCtrl configures the lighting channel we'll use. We want the light to affect GX_COLOR0, which is where our texture will be. We tell it to get the ambient and material color from the registers we set just before (GX_SRC_REG). We set GX_LIGHT0 as a light that affects this channel, with the default diffuse function GX_DF_CLAMP. Finally, we disable attenuation by passing GX_AF_NONE, meaning our light can be infinitely far away but still light our model as if it were right next to it.
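To the best of my understanding, with this setup the color of each vertex works out to roughly "material times (ambient plus light times clamped N dot L)". The sketch below shows that for a single color channel on the host. It's my mental model rather than a verified description of the hardware, and Vec3, dot, and litChannel are names I made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Lighting for one colour channel of one vertex, values in 0..1.
// normal and toLight are expected to be unit vectors in the same space.
float litChannel(float material, float ambient, float lightColor, Vec3 normal, Vec3 toLight)
{
    assert(std::fabs(dot(normal, normal) - 1.0f) < 1e-3f);
    // GX_DF_CLAMP: surfaces facing away from the light get 0 instead of a negative value.
    const float diffuse = std::max(0.0f, dot(normal, toLight));
    // GX_AF_NONE: no attenuation term, so distance doesn't matter.
    const float illumination = std::min(1.0f, ambient + lightColor * diffuse);
    return material * illumination;
}
```

A vertex facing away from the light only gets the ambient term, which is what keeps the back of the mesh from going pure black.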

Position transformation

We then calculate the light position, which is very far away opposite to the direction it'll shine. Note that we multiply it with the view matrix (with the translation part stripped out) as light stuff is in view space!
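Here's the same calculation as a standalone sketch that runs on the host. Mat3 stands in for the view matrix with its translation stripped out, and sunPositionInViewSpace is a hypothetical helper. The real code uses guVector and guVecMultiply instead.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Rotation part of the view matrix, row-major.
struct Mat3 { float m[3][3]; };

Vec3 rotate(const Mat3& r, Vec3 v)
{
    return {
        r.m[0][0] * v.x + r.m[0][1] * v.y + r.m[0][2] * v.z,
        r.m[1][0] * v.x + r.m[1][1] * v.y + r.m[1][2] * v.z,
        r.m[2][0] * v.x + r.m[2][1] * v.y + r.m[2][2] * v.z,
    };
}

// Places the fake "sun" far away, opposite to the direction the light
// shines in, and rotates that position into view space.
Vec3 sunPositionInViewSpace(const Mat3& viewRotation, Vec3 lightDirWorld, float distance)
{
    assert(distance > 0.0f);
    const Vec3 worldPos = {
        -lightDirWorld.x * distance,
        -lightDirWorld.y * distance,
        -lightDirWorld.z * distance,
    };
    return rotate(viewRotation, worldPos);
}
```

For a light shining straight down, (0, -1, 0), and a camera with no rotation, the sun ends up 100 units straight up at (0, 100, 0). Rotating the camera moves the sun's view-space position with it, which is why this has to be redone every frame.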

Light object creation

Finally we create our GXLightObj, giving it its position and color, and load it into the GX_LIGHT0 channel. Make sure to disable lighting on the fire (it makes its own light, wouldn't make sense to be in shadow) and wham! There's our sun!

Final picture of the Wii rendering of the vehicle

You can find all my lighting and TEV code in Effect.cpp. The filename is unfortunate, but as this was initially a DirectX project, I was stuck with that name from the header.

We're done!

I quickly built a GameCube version the night before the due date, and submitted the required .exe alongside my sneaky .dol binaries with no further elaboration. I wanted to keep the surprise. I showed up the next day to campus with a very full backpack, and when it was time pulled out the Wii to present my “extra features to raise your grades, but those are not mandatory.” It seemed to make quite a splash! Looks like I'm not the only one who grew up with the Wii :)

You can download a build here. Wiimote and GameCube controls are supported!

Tags for fedi: #homebrew #wii #gamecube #gcn #devkitpro #directx #linux #graphics #gamedev #foss #retrocomputing


Thanks for reading! Feel free to contact me if you have any suggestions or comments. Find me on Mastodon and Matrix.

You can follow the blog through:
    • ActivityPub by inputting @mat@blog.allpurposem.at
    • RSS/Atom: Copy this link into your reader: https://blog.allpurposem.at

My website: https://allpurposem.at