ndesmic

Basic WebGPU Rendering

Many of the articles I've been writing try to cover, where applicable, DOM + CSS, SVG, canvas, and WebGL in order to display an array of options for doing the same task; after all, everything is tradeoffs. One thing we haven't touched on so far is WebGPU. WebGPU is the next big thing and aims to be to WebGL what Vulkan is to OpenGL. It's lower-level, more powerful, and also a lot more verbose and complex. It even has its own shader language: the WebGPU Shading Language (WGSL). I find this effort interesting because WebGPU's web focus might give it reach beyond platform API politics in ways even Vulkan cannot (and Apple is onboard, in theory). As of this writing WebGPU is behind a flag in major browsers and still stabilizing, but it is complete enough to start working with, at least in Chromium. It's perhaps time we start implementing some of our previous problems using WebGPU.

It's going to be complex, so let's try to take it slow. To give a taste, here are the steps needed just to draw anything:

1) Get an adapter
2) Get a device from the adapter
3) Configure the canvas context
4) Create vertices
5) Create a shader program
6) Create a vertex buffer descriptor
7) Create a render pipeline
8) Create a command encoder
9) Create a pass encoder
10) Submit the draw operations
11) Submit the command to the queue

Don't worry, we'll step through it. I won't go down the side paths, though; I'll just mention that they're there and move on, because there's a lot to cover. Also, as much as I'll try to keep it accessible to new users (and I'm certainly not an expert on graphics, just a learner), a bit of background on how GPUs operate and some knowledge of another 3D graphics API like WebGL will be helpful to anchor yourself.

Finally, because the spec is not set in stone it might change, especially the names of things. It shouldn't change a lot, but the code might not work in the future (check the post date). I'll try to update it if that happens.

By the way, to actually test this you'll probably want Chrome Canary with the "Unsafe WebGPU" flag enabled. You'll get a scary warning once you do.

Get an adapter

const adapter = await navigator.gpu.requestAdapter(options);

First we start out by getting an adapter. An adapter is basically an implementation of WebGPU. By passing no options we get the default, which is probably what you want in most cases, but you can specifically request a high-performance or low-power adapter (or a software fallback) if available. The high-performance versus low-power distinction matters on systems that have both an integrated and a discrete GPU: it lets you choose which one.

The adapter has some properties that list the specific features and limits of the GPU if you need them.
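
For example, here's a sketch of requesting a specific adapter and peeking at its capabilities (powerPreference and the features/limits properties follow the draft spec and could still change):

const adapter = await navigator.gpu.requestAdapter({
    powerPreference: "high-performance" // or "low-power"; omit the options to let the browser decide
});

// Inspect what this adapter can actually do.
console.log([...adapter.features]);        // optional features, e.g. texture compression formats
console.log(adapter.limits.maxBindGroups); // one of many numeric limits

The main thing we want to do with the adapter, though, is get a device.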

Get a device

const device = await adapter.requestDevice(options);

Again we can pass in options to tell it what kind of device we want, and again we can pass nothing to get the default. A device is probably the closest thing to the context object you get from canvas or WebGL: it lets you create objects on the GPU, and you'll be using it a lot.
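
For instance, here's a sketch of asking for specific capabilities up front (requiredFeatures and requiredLimits are the current draft's names; earlier drafts spelled these differently):

// Request a device that opts into optional features or raised limits.
// An empty list/object is the same as the default.
const device = await adapter.requestDevice({
    requiredFeatures: [], // entries must come from adapter.features
    requiredLimits: {}    // e.g. { maxBindGroups: 6 }, within adapter.limits
});

// Devices can be lost (driver reset, hardware removed, etc.), so it's
// worth listening for that.
device.lost.then(info => console.warn("WebGPU device lost:", info.message));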

Get canvas context

const context = canvas.getContext("webgpu");

The canvas context works differently in WebGPU, as most of the meat is on the device. This is because WebGPU isn't just for drawing but also for GPU computation. Here the context is just used to configure and get the textures that are drawn to the canvas.

context.configure({
  device,
  format: 'bgra8unorm'
});

We pass in the device to associate it with this canvas as well as the format of the texture that will be drawn here.
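
Rather than hard-coding "bgra8unorm", the spec also has a helper that reports the platform's preferred canvas format. As a sketch (the helper has been renamed and moved between drafts; it's navigator.gpu.getPreferredCanvasFormat() in the current one):

// Use the preferred format when the helper exists, else fall back.
const format = navigator.gpu.getPreferredCanvasFormat?.() ?? "bgra8unorm";
context.configure({ device, format });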

Create Vertices

const vertices = new Float32Array([
    // x, y, z,    r, g, b, a
    -1.0, -1.0, 0, 1, 0, 0, 1, // bottom-left, red
     1.0, -1.0, 0, 0, 1, 0, 1, // bottom-right, green
    -1.0,  1.0, 0, 0, 0, 1, 1, // top-left, blue
     1.0,  1.0, 0, 0, 1, 1, 1  // top-right, cyan
]);

Ah, something familiar, at least coming from WebGL. We make a list of vertices to send to the GPU so it can render them. But this time it's much more manual. WebGPU can actually take several vertex buffers (the pipeline descriptor takes an array of them), but here we pack everything into a single buffer along with a description of how to get at the data. In the above example we have a quad, listed in triangle-strip order (-1,-1) (1,-1) (-1,1) (1,1) (more on that when we create the pipeline), and each vertex has a color (1,0,0,1) (0,1,0,1) (0,0,1,1) (0,1,1,1). If we want UVs, normals, etc., those need to be packed in there as well.

Then we send this data over to the GPU.

const vertexBuffer = device.createBuffer({
    size: vertices.byteLength,
    usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
    mappedAtCreation: true
});
new Float32Array(vertexBuffer.getMappedRange()).set(vertices);
vertexBuffer.unmap();

The first line creates a buffer on the GPU. We give it a size in bytes (not elements), followed by the usage flags. Here we're telling it the buffer holds vertices (as opposed to, say, indices) and that it's the destination of a copy. This is excruciatingly verbose, but these flags let the implementation pick an efficient memory layout and placement for the buffer. The last one, mappedAtCreation, needs a bit more explanation.

A "mapped" buffer means that the CPU can write to it, that is we can change the values in javascript. An "unmapped" buffer means the GPU has access to the buffer and the CPU doesn't. So it's a bit like an airlock. You "map" it to write new data and then "unmap" it to make it available to the GPU so both can't clobber each other. mappedAtCreation means that the buffer will start in a mapped (writable from JS) state. getMappedRange() creates an ArrayBuffer which represents the buffer on the GPU. getMappedRange(offset, size) can also be used to just get at parts of the buffer, by default you get the whole thing. By writing to this special ArrayBuffer we are actually uploading data to the GPU memory. Finally we unmap to allow it to be used on the GPU side of the fence.

Create a shader program

const shaderModule = device.createShaderModule({
  code: `...`
})

This is actually a bit easier to understand than WebGL, IMHO. First off, createShaderModule takes options: code is required, but you can also pass in a sourceMap property to enable source mapping. The code is the text of the WGSL shader program. Unlike WebGL, you'll include both the vertex and fragment shaders.
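
One thing worth knowing: a shader that fails to compile won't throw here. You have to ask the module for its diagnostics. A sketch (older drafts call this compilationInfo(), the current one getCompilationInfo()):

// Errors, warnings and info messages show up with line/column positions.
const info = await shaderModule.getCompilationInfo();
for (const msg of info.messages) {
    console.log(`${msg.type} at ${msg.lineNum}:${msg.linePos}: ${msg.message}`);
}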

WGSL

So what does that program look like? The language is so spanking new right now I don't have code highlighting for it.

struct VertexOut {
  [[builtin(position)]] position : vec4<f32>;
  [[location(0)]] color : vec4<f32>;
};

[[stage(vertex)]]
fn vertex_main([[location(0)]] position: vec4<f32>,
               [[location(1)]] color: vec4<f32>) -> VertexOut
{
  var output : VertexOut;
  output.position = position;
  output.color = color;
  return output;
}

[[stage(fragment)]]
fn fragment_main(fragData: VertexOut) -> [[location(0)]] vec4<f32>
{
  return fragData.color;
}

First let's pay attention to the 2 stages, which are marked with the annotations [[stage(vertex)]] and [[stage(fragment)]]. We'll ignore all the [[location(x)]] annotations for now; just know that things in [[ ]] are metadata annotations, important for the pipeline but not for the program semantics. Shaders are modeled as functions, not full programs like GLSL. We can see that the vertex inputs are a vec4<f32> for position and a vec4<f32> for color (our buffer only supplies 3 floats per position; the missing w component is filled in for us). The syntax is very Rust-inspired, but hopefully without knowing Rust you can tell these are vectors of float32s. The -> after the argument list flags the return value, in this case a VertexOut. This is a user-defined type: unlike GLSL's magic global varying variables, we actually tell it what the output structure looks like, and that is defined at the top with the struct keyword. This is similar to a struct in other languages, just a block of memory with a type. We output 2 things: a position and the color. The position is special, as you may expect, because the GPU internally uses it to draw in clip space; as such it has a special marker [[builtin(position)]] and is required to be a vec4<f32>. The color is freeform, but we're using 4-valued color vectors.

In the fragment stage we take in that output structure and return a vec4<f32>, which is the color of the pixel/fragment. We're using a very simple pass-through fragment shader. Like GLSL, the fragment shader values are interpolated for you; that is, it automatically generates the values between vertices, and WebGPU even lets you change how that works (WGSL has an interpolate attribute for this).

Create a vertex buffer descriptor

const vertexBufferDescriptor = [{
    attributes: [
        {
            shaderLocation: 0,
            offset: 0,
            format: "float32x3"
        },
        {
            shaderLocation: 1,
            offset: 12,
            format: "float32x4"
        }
    ],
    arrayStride: 28,
    stepMode: "vertex"
}];

Previously we made a vertex buffer, a big chunk of data with our vertex info to send to the GPU. Now we tell the GPU how to decode it. Here we say that each element is a set of attributes: the first is at offset 0 (within the element) and is 3 float32s; the second is located immediately after it at byte 12 (3 × 4 bytes) and is 4 float32s. Each of these has a shaderLocation. This plays the role of getAttribLocation in WebGL, only this time we manually say what it binds to: these directly correspond to the [[location(x)]] annotations we saw in the shader program, so the first attribute is the first parameter of the vertex shader and the second is the second. Then we have the arrayStride, which is how big a jump we make per element: (4 × 3) + (4 × 4) = 28 bytes. Finally, stepMode is either "vertex" or "instance". I haven't talked about instanced drawing, but it's a way to speed up drawing many copies of the same thing.
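
Hand-counting byte offsets like this gets error-prone fast. A hypothetical little helper, buildVertexLayout, could derive the offsets and stride from component counts instead; a sketch:

// Derive offsets and stride from per-attribute component counts so the
// byte math can't drift out of sync with the data.
const F32 = Float32Array.BYTES_PER_ELEMENT; // 4 bytes
function buildVertexLayout(componentCounts) {
    let offset = 0;
    const attributes = componentCounts.map((components, shaderLocation) => {
        const attribute = { shaderLocation, offset, format: `float32x${components}` };
        offset += components * F32;
        return attribute;
    });
    return [{ attributes, arrayStride: offset, stepMode: "vertex" }];
}

buildVertexLayout([3, 4]); // position + color: produces the same descriptor as above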

Create a Render Pipeline

const pipelineDescriptor = {
    vertex: {
        module: shaderModule,
        entryPoint: "vertex_main",
        buffers: vertexBufferDescriptor
    },
    fragment: {
        module: shaderModule,
        entryPoint: "fragment_main",
        targets: [
            {
                format: "bgra8unorm"
            }
        ]
    },
    primitive: {
        topology: "triangle-list"
    }
};
const renderPipeline = device.createRenderPipeline(pipelineDescriptor);

Now we create the high-level pipeline. Each stage has a few tweakable parameters, but it's mainly a boilerplate step. In this case we have the 3 important parts (the options not shown deal with depth testing and stencils, which is more advanced). For the vertex stage we give it the shader module we want to use, the name of the entry-point function for that stage (this seems redundant to me since we annotated them, but maybe the expectation is that a module compiles multiple frag/vert shader functions?), and the vertex buffer descriptor from the last step. Next we do the fragment shader: same idea for module and entryPoint, but we also tell it the output format of the framebuffer (this should match the one we configured on the canvas). And lastly, the primitive topology. This is familiar from WebGL; it's how vertices are assembled into triangles: "triangle-strip", "triangle-list", etc. Here it's a triangle strip: after the first triangle, every additional vertex forms a new triangle with the previous two, which is why four vertices in the right order are enough for our quad.
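
For reference, the primitive block has a couple more knobs (field names per the draft GPUPrimitiveState); a sketch with the defaults spelled out:

const primitive = {
    topology: "triangle-strip",
    frontFace: "ccw", // counter-clockwise triangles count as front-facing
    cullMode: "none"  // "front" or "back" skips rasterizing hidden faces
};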

Create a command encoder

const commandEncoder = device.createCommandEncoder();

Here's where things go off the rails. We need to create a thing called a "command encoder." I was a little hazy at first on why this is necessary and why we can't just submit things directly to the GPU with the API doing this step behind the scenes, but the idea is that recording commands up front lets them be validated and handed to the GPU as one batch instead of chattily, call by call. It basically does what it says: we create a set of commands to submit to the GPU, and this "encodes" them.

Create a pass encoder

const clearColor = { r: 0.2, g: 0.2, b: 0.2, a: 1.0 };
const renderPassDescriptor = {
    colorAttachments: [
        {
            loadValue: clearColor,
            storeOp: "store",
            view: context.getCurrentTexture().createView()
        }
    ]
};
const passEncoder = commandEncoder.beginRenderPass(renderPassDescriptor);

This is probably the weirdest part and I'm not too sure about all the details here. One of the kinds of commands we can pass the GPU is a render pass, which is the thing we want in most cases. A render pass is a series of paint operations applied to one or more attachments. Here we have colorAttachments, which say we will render color into a view; the view here is the texture that underlies the canvas element. There are other attachments, like depth and stencil, but we won't worry about those. Each attachment gets some sort of load operation that sets its starting value; in the case of a color attachment, the load value is the clear color. Then there's the storeOp, which tells the GPU to write the result back out to the view.
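
If you don't want to clear at all, the load value can also preserve what's already in the target. A sketch (in this draft of the API loadValue accepts the string "load"; newer drafts split this into a loadOp of "load" or "clear" plus a clearValue):

// Keep the previous contents of the target instead of clearing it,
// useful when layering multiple passes onto the same texture.
const renderPassDescriptor = {
    colorAttachments: [{
        loadValue: "load", // instead of a clear color
        storeOp: "store",
        view: context.getCurrentTexture().createView()
    }]
};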

We call beginRenderPass on our command encoder with the description and get back a pass encoder.

Submit the draw operations

passEncoder.setPipeline(renderPipeline);
passEncoder.setVertexBuffer(0, vertexBuffer);
passEncoder.draw(4);
passEncoder.endPass();

We set up the render pass structure; now we can submit the parameters for actually drawing something. We set the pipeline to the render pipeline we built, then attach the vertex buffer we want to draw (the 0 is the slot, matching the buffer's position in the pipeline's buffers array), then tell it how many vertices to draw. If we were using an index buffer we'd attach it with setIndexBuffer(buffer, indexFormat) and call drawIndexed(indexCount) instead. Finally we end the pass, which finishes recording it; nothing actually runs until we submit.
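
For completeness, here's what that indexed path might look like for our quad with a "triangle-list" topology instead. This is only a sketch: indexBuffer is hypothetical and would need to be created with GPUBufferUsage.INDEX the same way we made the vertex buffer.

// Two triangles sharing two corners: 4 vertices, 6 indices.
const indices = new Uint16Array([0, 1, 2, 2, 1, 3]);

passEncoder.setPipeline(renderPipeline);
passEncoder.setVertexBuffer(0, vertexBuffer);
passEncoder.setIndexBuffer(indexBuffer, "uint16");
passEncoder.drawIndexed(indices.length); // 6
passEncoder.endPass();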

Submit the command to the queue

device.queue.submit([commandEncoder.finish()]);

But we aren't quite done. We've given everything to the command encoder; now we need to actually encode. That happens when we call finish(), which gives back a CommandBuffer representing the serialized commands. We can take an array of these command buffers and submit it to the device's queue, which will queue them up and execute them on the GPU.
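
One practical note if you want to animate: the texture from getCurrentTexture() is only valid for the current frame, so a render loop re-records and resubmits everything each frame. A sketch using the same draft API as the rest of this post:

function frame() {
    const commandEncoder = device.createCommandEncoder();
    const passEncoder = commandEncoder.beginRenderPass({
        colorAttachments: [{
            loadValue: { r: 0.2, g: 0.2, b: 0.2, a: 1.0 },
            storeOp: "store",
            view: context.getCurrentTexture().createView() // fresh texture every frame
        }]
    });
    passEncoder.setPipeline(renderPipeline);
    passEncoder.setVertexBuffer(0, vertexBuffer);
    passEncoder.draw(4);
    passEncoder.endPass();
    device.queue.submit([commandEncoder.finish()]);
    requestAnimationFrame(frame);
}
requestAnimationFrame(frame);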

And with that, we have all the steps to build the most basic hello-world WebGPU renderer, one that renders a square. It's a lot of work, but at least the API is more ergonomic for JavaScript than WebGL's.

I hate to end right here, as we haven't really built anything, but considering the length required to cover the topic, implementing something in real code will have to be another post.

The Future

Should you start thinking about porting all your WebGL code to WebGPU? Well, no. The web retains backwards compatibility, so what's there is going to stay, and WebGPU is still a ways off, especially if we're talking cross-browser support (looking at you, Apple). Unless you have a very high-end game (and then you probably wouldn't need to read my posts), WebGPU isn't going to grant you new superpowers. But I do wonder what it means for the future of the WebGL API. I can't imagine there being much taste for a WebGL 3.0, as WebGPU adds most of the missing things in a much more JavaScript-friendly API. I guess we'll see, but for now you might start exploring to get a feel for WebGPU, as it will likely wind up as the default entry point for 3D graphics on the web.

If you're looking for another source you can check out https://metalbyexample.com/webgpu-part-one/ which is a fantastic basic tutorial I used to bootstrap myself on the topic. You don't need to know about Metal either.

Top comments (4)

Yassine Elouafi

Nicely explained!

In "Finally we have the stride which is how big of a jump we have per element. (f32 * 3) + (f32 * 4) = 24 bytes per element"

Shouldn't it be 28 bytes instead?

ndesmic

Yup, you are right.

题叶

I have a question: if I want to render multiple things that are drawn with different shaders, how do I treat "pass" and "clearColor"?

题叶

doing my experiments github.com/Triadica/lagopus/