The most important optimization yet
While the previous blog post's asynchronous grass generation code was a big step in the right direction, the biggest optimization yet has come: Custom vertex shading!
The problem
CPUs are fast, general-purpose processors for handling computations. GPUs are specially designed for parallel workloads (such as computing visuals for your high-res screen).
From a hardware and game loop perspective, the CPU computes the game logic every frame while the GPU renders the results. Bevy calls these two abstractions the Main World and Render World.
This CPU->GPU pipeline is not an issue until large amounts of data are being passed, at which point it becomes a significant performance bottleneck. You can see this in older versions of this project: low GPU usage, but also low FPS.
In other words, misusing the CPU for large-scale parallelizable workloads instead of the GPU is a triple threat of inefficiency: the CPU is just not as good at parallel workloads, the CPU->GPU pipeline has limited bandwidth, and, as a result, the GPU sits largely underutilized.
The solution
To move our data and logic to the GPU, we need to first understand what a shader is.
A shader is a set of instructions the GPU uses as part of rendering pixels to the screen. There are often many shaders in a rendering pipeline. A vertex shader defines how the GPU processes vertices from the game world into clip-space (roughly, the position of the vertices relative to the camera), while a fragment shader is a later stage in the pipeline that runs the final computations on a pixel's color.
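To make the vertex stage concrete, here is a minimal CPU-side sketch in Rust of the kind of work a vertex shader does for each vertex: transform a local position to world space, then to clip space. The `Mat4` alias, the row-major layout, and the function names are assumptions made for illustration, not Bevy's API. A real vertex shader does this in WGSL on the GPU, for many vertices in parallel.

```rust
/// A 4x4 matrix stored in row-major order (an assumption made for this sketch).
type Mat4 = [[f32; 4]; 4];

/// Multiplies a 4x4 matrix by a 4-component column vector.
fn transform(m: &Mat4, v: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0; 4];
    for row in 0..4 {
        for col in 0..4 {
            out[row] += m[row][col] * v[col];
        }
    }
    out
}

/// Hypothetical stand-in for a vertex shader:
/// local space -> world space (model matrix) -> clip space (view-projection matrix).
fn vertex_stage(model: &Mat4, view_proj: &Mat4, local_pos: [f32; 3]) -> [f32; 4] {
    let world = transform(model, [local_pos[0], local_pos[1], local_pos[2], 1.0]);
    transform(view_proj, world)
}
```

For example, with a model matrix that translates by (10, 0, 5) and an identity matrix standing in for the camera, `vertex_stage` maps `[1.0, 2.0, 3.0]` to `[11.0, 2.0, 8.0, 1.0]`.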
We need to write a vertex shader that contains our wind simulation logic. We also need to provide that shader with any additional data it needs besides standard vertex data.
This involves uncharted territory: writing WGSL (WebGPU Shading Language). Bevy uses WGSL as a shader language because it converts into the shader languages for each of the rendering backends (Vulkan, DX12, OpenGL, etc.), providing one consistent interface. WGSL is where we will define the input data format as well as the shader logic for the grass, while in Bevy we will register this new pipeline and provide the data. Fortunately, we can start from Bevy's existing shader implementations instead of starting from scratch.
Additionally, Bevy thoughtfully provides an interface for appending custom functionality onto its existing PBR (physically based rendering) pipeline: ExtendedMaterial. Let's get to it!
The Rust side of things
ExtendedMaterial provides a means of adding additional data and functionality to any material. We are going to extend StandardMaterial by using it as the base of an ExtendedMaterial and writing a GrassMaterialExtension for the extension.
First, we define the empty struct we will use to extend StandardMaterial. Here, we could pass some additional material data, but we don't have any for our use case.
// Empty extension: no extra material data is needed for the grass.
#[derive(Asset, TypePath, AsBindGroup, Debug, Clone)]
pub struct GrassMaterialExtension {}
This allows us to instantiate a new ExtendedMaterial:
let grass_material = ExtendedMaterial {
    base: grass_material_std,
    extension: grass_material_ext,
};
Where grass_material_std is this StandardMaterial instantiation:
StandardMaterial {
    base_color: Color::WHITE,
    double_sided: false,
    perceptual_roughness: 1.0,
    reflectance: 0.5,
    cull_mode: None,
    opaque_render_method: bevy::pbr::OpaqueRendererMethod::Forward,
    unlit: false,
    ..default()
}
and grass_material_ext is just the GrassMaterialExtension instantiated:
let grass_material_ext = GrassMaterialExtension {};
This requires updating parts of our code from old references to StandardMaterial to ExtendedMaterial<StandardMaterial, GrassMaterialExtension>, and PbrBundle to MaterialMeshBundle with the appropriate generics.
We then need to actually implement MaterialExtension for GrassMaterialExtension:
impl MaterialExtension for GrassMaterialExtension {
    // Only the vertex stage is overridden; every other stage keeps the StandardMaterial default.
    fn vertex_shader() -> bevy::render::render_resource::ShaderRef {
        "shaders/grass_shader.wgsl".into()
    }

    fn specialize(
        _pipeline: &MaterialExtensionPipeline,
        descriptor: &mut RenderPipelineDescriptor,
        layout: &MeshVertexBufferLayout,
        _key: MaterialExtensionKey<GrassMaterialExtension>,
    ) -> Result<(), SpecializedMeshPipelineError> {
        // Shader locations used by the forward pass.
        let pos_position = 0;
        let mut normal_position = 1;
        let mut color_position = 5;
        // The prepass expects normal and color at different locations.
        if let Some(label) = &descriptor.label {
            // println!("Label is: {}", label);
            if label == "pbr_prepass_pipeline" {
                normal_position = 3;
                color_position = 7;
            }
        }
        let vertex_layout = layout.get_layout(&[
            Mesh::ATTRIBUTE_POSITION.at_shader_location(pos_position),
            Mesh::ATTRIBUTE_NORMAL.at_shader_location(normal_position),
            Mesh::ATTRIBUTE_COLOR.at_shader_location(color_position),
            // Mesh::ATTRIBUTE_UV_0.at_shader_location(1),
            // Mesh::ATTRIBUTE_TANGENT.at_shader_location(4),
            // Custom per-vertex data consumed by the wind simulation.
            ATTRIBUTE_STARTING_POSITION.at_shader_location(17),
            ATTRIBUTE_WORLD_POSITION.at_shader_location(18),
        ])?;
        descriptor.vertex.buffers = vec![vertex_layout];
        Ok(())
    }
}
Above, we see that the only shader function we implement is vertex_shader. The rest of the shader functions will be the default StandardMaterial shaders. We also implement the specialize function, which we use to define the VertexBufferLayout provided to the shader. To fit into the existing PBR shaders, I looked at some of Bevy's shader code. Specifically, I referred to the default vertex struct and shader, the Vertex struct definition for forward rendering, and the Vertex struct definition for prepass.
Based on these resources, I figured out that in forward passes, I needed to bind position, normal, and color data to locations 0, 1, and 5 respectively, but in the prepass, I needed to bind them to 0, 3, and 7 (if you remove the if statement, an error is thrown during prepass about missing vertex data at location 7). So, in what feels like a hacky solution, I use descriptor.label to determine which pipeline we're in and assign the vertex buffer locations accordingly. I also pass two custom data buffers: the starting position of the vertex, and the world position of the base of the grass blade, both of which are necessary in order to simulate wind.
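The label check can be isolated into a small pure function, which makes the forward/prepass difference easier to see and to test without a renderer. This is only a sketch: the label string and location numbers come from the code above, while the `AttributeLocations` struct and `locations_for_label` function are hypothetical names introduced here.

```rust
/// Shader locations for the three standard attributes the grass mesh uses.
/// (Hypothetical helper type, not part of Bevy.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct AttributeLocations {
    position: u32,
    normal: u32,
    color: u32,
}

/// Chooses locations from the pipeline label, mirroring the `if` in `specialize`.
/// Any label other than the prepass one (or no label) falls back to the forward layout.
fn locations_for_label(label: Option<&str>) -> AttributeLocations {
    match label {
        Some("pbr_prepass_pipeline") => AttributeLocations { position: 0, normal: 3, color: 7 },
        _ => AttributeLocations { position: 0, normal: 1, color: 5 },
    }
}
```

Matching on a label string is as fragile here as it is in the original: if the label text changes in a future Bevy release, this silently falls back to the forward layout.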
We also need to add these attributes to the mesh upon generation, adding the following lines to generate_grass_geometry:
mesh.insert_attribute(ATTRIBUTE_STARTING_POSITION, positions);
mesh.insert_attribute(ATTRIBUTE_WORLD_POSITION, grass_offsets.clone());
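Both attributes are per-vertex, so each array needs one entry for every vertex in the mesh. The sketch below shows one way such arrays could be built, assuming each blade stores its base position and its vertices relative to that base. The `Blade` struct and `build_attributes` function are hypothetical; only the names `positions` and `grass_offsets` come from the code above, and the real generate_grass_geometry may differ.

```rust
/// One grass blade: the world position of its base and its vertices
/// relative to that base. (Hypothetical layout for this sketch.)
struct Blade {
    base: [f32; 3],
    local_vertices: Vec<[f32; 3]>,
}

/// Builds two per-vertex arrays of equal length:
/// - `positions`: each vertex position (blade base + local offset)
/// - `grass_offsets`: the blade's base position, repeated for each of its vertices
fn build_attributes(blades: &[Blade]) -> (Vec<[f32; 3]>, Vec<[f32; 3]>) {
    let mut positions = Vec::new();
    let mut grass_offsets = Vec::new();
    for blade in blades {
        for v in &blade.local_vertices {
            positions.push([
                blade.base[0] + v[0],
                blade.base[1] + v[1],
                blade.base[2] + v[2],
            ]);
            // Every vertex of a blade carries the same base position.
            grass_offsets.push(blade.base);
        }
    }
    (positions, grass_offsets)
}
```

Repeating the base position per vertex costs extra memory, but it lets the vertex shader work out how high a vertex sits above its blade's base without any lookup.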
The shader code
Writing the shader is fairly straightforward: we're going to copy over mesh.wgsl to our own grass_shader.wgsl file. We modify the Vertex struct definition to include the two additional attributes:
@location(17) starting_position: vec3<f32>,
@location(18) world_position: vec3<f32>
We add our computations to the vertex function, installing the crate bevy_shader_utils and importing the convenient perlin_noise_2d function it provides:
// calculation of wind and new x, y, z coords
var noise = perlin_noise_2d(vec2<f32>(vertex_no_morph.world_position.x/50.0 + globals.time * 0.5, vertex_no_morph.world_position.z/50.0 + globals.time * 0.5));
var new_x = vertex_no_morph.starting_position.x + noise * ((vertex_no_morph.position.y-vertex_no_morph.world_position.y) / 2.4);
var new_y = vertex_no_morph.position.y;
var new_z = vertex_no_morph.starting_position.z + noise * ((vertex_no_morph.position.y-vertex_no_morph.world_position.y) / 2.4);
To actually use these calculations as the new position of the vertex, we modify the line in the function defining out.world_position to the following:
out.world_position = mesh_functions::mesh_position_local_to_world(model, vec4<f32>(vec3<f32>(new_x, new_y, new_z), 1.0));
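To see how the displacement behaves, the same arithmetic can be mirrored on the CPU in plain Rust. This is a sketch for reasoning about the formula, not the shader itself: `stand_in_noise` is a placeholder I made up (it is not Perlin noise and not the bevy_shader_utils function), and `displaced_position` is a hypothetical name. The constants 50.0, 0.5, and 2.4 are taken from the WGSL above.

```rust
/// Placeholder for `perlin_noise_2d`. NOT real Perlin noise; it only
/// provides a smooth value in [-1, 1] so the formula can be exercised.
fn stand_in_noise(x: f32, z: f32) -> f32 {
    (x.sin() + z.cos()) * 0.5
}

/// Mirrors the shader's wind math for a single vertex.
/// `world_position` is the base of the blade this vertex belongs to.
fn displaced_position(
    starting_position: [f32; 3],
    position: [f32; 3],
    world_position: [f32; 3],
    time: f32,
) -> [f32; 3] {
    // Noise is sampled at the blade base, so all vertices of a blade sway together.
    let noise = stand_in_noise(
        world_position[0] / 50.0 + time * 0.5,
        world_position[2] / 50.0 + time * 0.5,
    );
    // Sway scales with height above the base, so the base itself never moves.
    let sway = noise * ((position[1] - world_position[1]) / 2.4);
    [
        starting_position[0] + sway,
        position[1],
        starting_position[2] + sway,
    ]
}
```

Two properties fall out of the formula: a vertex at the blade's base is never displaced, and x and z always move by the same amount, so each blade sways along a single diagonal direction.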
To recap, we inserted additional vertex data into our mesh, defined an ExtendedMaterial that lets us supply our own vertex shader along with a vertex buffer layout that includes this additional data, and wrote a vertex shader adapted from the existing mesh.wgsl that uses this data to simulate wind on each vertex.
Finally, after removing the old CPU wind simulation logic, the impact is massive: over a million blades of fully-lit grass, all simulating wind, at over 60fps:
Final notes
My foray into shaders was fun, but not without challenges. Finding the best way to achieve what I wanted while fitting it into Bevy's existing complex shader logic was tough, even with ExtendedMaterial. There are some examples to base things on, but I was still shooting in the dark at times, mainly with my initial approach of trying to write the custom vertex shader from scratch. For a while, I had it partially working, but lighting was broken. Fortunately, I finally realized I should just add onto the existing vertex shader in mesh.wgsl, and things quickly fell into place from there.
The impact this has on the project cannot be overstated. This project went from handling ~200k blades of mostly static grass to over a million blades of grass, all simulating wind, at over 60fps. It is a dramatic performance increase that has leveled up the grass from just visually serviceable to stunning and practical for gameplay.
Future optimizations for grass still exist, mainly LOD (level of detail). However, I am likely shifting my attention to more realistic terrain generation, since I recently saw an incredible video about the subject.
As always, check out the repo for the latest updates!