Three others and I have been working on a tool called Allusion in our spare time: a free image organization application built for artists. It runs as a ReactJS application in Electron.
One of its key components is the image gallery. Since users may import thousands of images, we can't just render them all with plain HTML and CSS. Over the course of development, we tried several out-of-the-box ReactJS packages (mainly react-window and react-virtualized), but none really suited our needs, be it in design or performance.
In the end, we wrote our own super slick image gallery from scratch. It turned out quite nice, so I wanted to share our findings.
The requirements we set for ourselves:
- Keep as much as possible off the main UI thread to keep everything snappy
- Keep computation time within a few milliseconds for up to ~10,000 images
- Configurable thumbnail sizes
- Three layout modes: A simple grid, vertical (column) masonry, horizontal (row) masonry
The main caveat of our method is that it needs to know the image resolutions beforehand, though it could probably be adapted to measure them on the fly too. Measuring on the fly is what made the alternatives we tried feel clunky, so we avoided it. Since we store the image dimensions in a database anyway, it's no problem for our use case.
Our gallery is built up from three main parts:
- The masonry layout algorithm itself, written in Rust
- The webworker and shared memory between the main thread and WASM
- The virtualized image renderer as a ReactJS component
Masonry algorithm in WebAssembly
Rust was something I had wanted to get into for a while, and it's a natural fit for WASM modules.
The module is set up with wasm-pack, which outputs your WASM file along with TypeScript definitions as an easily importable ES6 module.
Transferring data
To provide the WASM package with the image dimensions it uses as input, we define a vector of Transform structs:
```rust
pub struct Transform {
    src_width: u16,
    src_height: u16,
}
```
We chose to read the output of the layout computation from the same entry, for which we'll need some extra fields:
```rust
pub struct Transform {
    src_width: u16,  // input dimensions (pixels)
    src_height: u16,
    width: u16,      // output dimensions (pixels)
    height: u16,
    left: u16,       // output offset in the layout (pixels)
    top: u16,
}
```
We then define a Layout as follows:
```rust
pub struct Layout {
    num_items: usize,
    items: Vec<Transform>,
    thumbnail_size: u16, // the desired output size
    padding: u16,        // the amount of pixels in between two images
}
```
Back in JavaScript land, we ask for a pointer to that items vector in WASM memory and put our image dimensions in there one by one:
```rust
impl Layout {
    pub fn items(&self) -> *const Transform {
        self.items.as_ptr()
    }
}
```
```ts
import { default as init, InitOutput, Layout } from 'masonry/pkg/masonry';

const WASM = await init('masonry/pkg/masonry_bg.wasm');
const layout = Layout.new(numItems);
const ptr = layout.items();
const items = new Uint16Array(WASM.memory.buffer, ptr, MAX_ITEMS);

async function computeLayout(images: Image[]) {
  for (let i = 0; i < images.length; i++) {
    // Every item consists of 6 uint16s
    items[i * 6 + 0] = images[i].width;
    items[i * 6 + 1] = images[i].height;
  }
  await layout.compute(); // I'll cover this method next!
  // And now we can do something with the layout!
}
```
```ts
function getItemTransform(index: number) {
  return {
    width: items[index * 6 + 2], // same order as in Rust
    height: items[index * 6 + 3],
    left: items[index * 6 + 4],
    top: items[index * 6 + 5],
  };
}
```
At first, we allocated memory for the transforms every time the layout was computed, but in practice the layout is re-computed many times over. To eliminate that overhead, we now reserve a chunk of memory that we use for the lifetime of the module. With just a few megabytes, we can support hundreds of thousands of images.
One extra change was necessary: the top offset can easily grow beyond the uint16 maximum of 65,535 pixels. With rows of square 200px images, we reach that limit after only about 328 rows. That's no good. Therefore, we moved the top offsets to a separate vector of uint32 values, which will last us millions of such rows.
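A quick sketch of why this matters: typed arrays wrap silently on overflow, so an out-of-range top offset would not crash but would quietly scramble the layout:

```typescript
// A Uint16Array silently wraps values modulo 2^16, so a top offset past
// 65,535 px would corrupt the layout:
const u16 = new Uint16Array(1);
u16[0] = 70_000;
console.log(u16[0]); // 4464, i.e. 70000 - 65536

// A Uint32Array holds offsets up to ~4.29 billion px, which is why the
// top offsets were moved to a separate vector of u32 values:
const u32 = new Uint32Array(1);
u32[0] = 70_000;
console.log(u32[0]); // 70000
```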
Layout algorithms
The vertical masonry layout is my personal favourite, so that's the one I'll cover here. It's quite simple really: we determine the number of columns that fit within the container width given the desired column width, then iteratively place each image in the shortest column so far.
```rust
impl Layout {
    pub fn compute_vertical(&mut self, container_width: u16) -> u32 {
        // First: Determine width of each column and initialize each column height at 0 pixels
        let (col_width, mut col_heights) = {
            let container_width = f32::from(container_width);
            let n_columns = (container_width / f32::from(self.thumbnail_size)).round();
            if n_columns == 0.0 {
                return 0;
            }
            let col_width = (container_width / n_columns).round() as u16;
            let col_heights: Vec<u32> = vec![0; n_columns as usize];
            (col_width, col_heights)
        };
        let item_width = col_width - self.padding;

        // Then loop over all images and place them in the shortest column
        let (current_items, _) = self.items.split_at_mut(self.num_items);
        for (item, top_offset) in current_items.iter_mut().zip(self.top_offsets.iter_mut()) {
            item.width = item_width;
            // take into account the aspect ratio for the height
            item.height = ((f32::from(item_width) / f32::from(item.src_width))
                * f32::from(item.src_height))
                .round() as u16;

            let shortest_col_index = col_heights
                .iter()
                .enumerate()
                .min_by_key(|(_idx, &val)| val)
                .map_or(0, |(idx, _val)| idx);

            item.left = shortest_col_index as u16 * col_width;
            *top_offset = col_heights[shortest_col_index];
            col_heights[shortest_col_index] += u32::from(item.height) + u32::from(self.padding);
        }

        // Return the height of the longest column
        col_heights.iter().max().map_or(0, |max| *max)
    }
}
```
Performance
Now, is this any good in practice? Well, I implemented the same layout computation function in TypeScript (transpiled down to JavaScript), and measured the performance of both for a gallery of 5000 images in release mode:
It's a solid 0.2ms faster! Yeah... WebAssembly might have been a little overkill for a simple O(n) calculation like this. It might even be worse than the TS equivalent, since we need to put all of the image dimensions into a buffer first. Though, it does pave the way for more complex layout computations (I'll link to some resources at the end), for which I'm sure it would pay off.
As for the high peaks in the WASM measurements, I'm not completely sure what causes them. I would have expected them to happen for the TS version instead, since Rust doesn't do garbage collection. I couldn't find anything weird in the glue code generated by wasm-pack, so I suspect it must be something in the WebAssembly runtime itself.
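For reference, the TypeScript counterpart looks roughly like this. It's a sketch of the same shortest-column algorithm, not the exact code we benchmarked; the Transform interface mirrors the Rust struct:

```typescript
// Mirrors the Rust Transform struct (dimensions and offsets in pixels).
interface Transform {
  srcWidth: number;
  srcHeight: number;
  width: number;
  height: number;
  left: number;
  top: number;
}

// Same algorithm as compute_vertical: place each image in the shortest
// column so far. Returns the height of the longest column.
function computeVertical(
  items: Transform[],
  containerWidth: number,
  thumbnailSize: number,
  padding: number,
): number {
  const nColumns = Math.round(containerWidth / thumbnailSize);
  if (nColumns === 0) return 0;
  const colWidth = Math.round(containerWidth / nColumns);
  const colHeights = new Array<number>(nColumns).fill(0);
  const itemWidth = colWidth - padding;

  for (const item of items) {
    // Scale the height to preserve the aspect ratio at the new width
    item.height = Math.round((itemWidth / item.srcWidth) * item.srcHeight);
    item.width = itemWidth;
    // Find the shortest column and place the image there
    let shortest = 0;
    for (let i = 1; i < nColumns; i++) {
      if (colHeights[i] < colHeights[shortest]) shortest = i;
    }
    item.left = shortest * colWidth;
    item.top = colHeights[shortest];
    colHeights[shortest] += item.height + padding;
  }
  return Math.max(...colHeights);
}
```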
WebWorker with shared memory
Even though the computation takes less than a millisecond on my machine, it might not on low-end devices or under heavy load.
By computing the layout in a WebWorker, it won't interrupt the main UI thread, meaning that the application will stay responsive.
We opted for setting up a WebWorker using Comlink, mainly for its ease of use.
We don't want to copy the memory buffer every time a message is sent from the worker. Figuring out how to set up shared memory between the WASM memory in the worker and the main thread was the biggest time sink of this adventure.
At first we sent the buffer as a Transferable, but this stopped working in a recent release of Chrome. Instead, we configure the WASM memory as a SharedArrayBuffer, which has the same capability. This is not supported out of the box: follow this guide to learn more.
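Stripped of the wasm-pack glue, the core idea is that WebAssembly memory created with the shared flag is backed by a SharedArrayBuffer, so views over it can be handed between threads without copying. This sketch is illustrative, not our actual setup; in browsers it also requires the cross-origin isolation headers described in the guide:

```typescript
// Memory created with `shared: true` is backed by a SharedArrayBuffer
// instead of a regular ArrayBuffer, so a worker and the main thread can
// read and write the same bytes.
const memory = new WebAssembly.Memory({
  initial: 16,  // in pages of 64 KiB
  maximum: 1024,
  shared: true, // the crucial flag
});

console.log(memory.buffer instanceof SharedArrayBuffer); // true

// Any view over this buffer sees writes made through any other view.
const view = new Uint16Array(memory.buffer, 0, 6);
view[0] = 1920;
```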
```ts
// masonry.worker.ts
import { default as init, InitOutput, Layout } from 'masonry/pkg/masonry';
import { expose } from 'comlink';

export class MasonryWorker {
  WASM?: InitOutput;
  layout?: Layout;
  items?: Uint16Array;

  async initializeLayout(numItems: number): Promise<Uint16Array> {
    this.WASM = await init('./wasm/masonry/pkg/masonry_bg.wasm');
    this.layout = Layout.new(numItems);
    const itemsPtr = this.layout.items();
    const sharedArrayBuffer = this.WASM.__wbindgen_export_0.buffer;
    this.items = new Uint16Array(sharedArrayBuffer, itemsPtr, MAX_ITEMS);
    return this.items;
  }
}

expose(MasonryWorker, self);
```
```ts
// MasonryWorkerAdapter.ts
import { Remote, wrap } from 'comlink';
import MasonryWorkerClass, { MasonryWorker } from './masonry.worker';

export class MasonryWorkerAdapter {
  worker?: Remote<MasonryWorker>;
  items?: Uint16Array;

  async initialize(numItems: number) {
    const WorkerFactory = wrap<typeof MasonryWorker>(new MasonryWorkerClass());
    this.worker = await new WorkerFactory();
    this.items = await this.worker.initializeLayout(numItems);
    // And now here in the main thread we can access WASM memory that was initialized in the worker!
  }
}
```
Virtualized gallery renderer
The last step is to actually render the images in the layout that is computed. Since this is intended for a ReactJS application, the images are rendered as DOM nodes, but the same layout could also be used to render images in a canvas.
We could just put all images in the DOM, since the browser is quite good at only rendering whatever is visible in the viewport. We can make it a lot faster, though, by only putting the images that are visible in the viewport into the DOM tree. This is called "virtualized rendering".
Any time the viewport dimensions change, the user scrolls, or a similar event occurs, we have to re-evaluate which images to render.
```tsx
const VirtualizedRenderer = ({ containerWidth, images }: VirtualizedRendererProps) => {
  const layout = useMemo(() => ..., []);
  const viewportRef = useRef<HTMLDivElement>(null);
  const containerHeight = useMemo(() => layout.recompute(containerWidth), [containerWidth]);

  // Find the top and bottom edge of the viewport in the layout
  // (omitted for brevity: we do a binary search)
  const [startRenderIndex, endRenderIndex] = determineViewportRegion(
    layout,
    viewportRef.current?.scrollTop ?? 0,
    viewportRef.current?.clientHeight ?? 0,
  );

  return (
    // One div as the scrollable viewport
    <div className={className} onScroll={handleScroll} ref={viewportRef}>
      {/* One div for the content */}
      <div style={{ width: containerWidth, height: containerHeight }}>
        {images.slice(startRenderIndex, endRenderIndex + 1).map((im, index) => {
          const fileListIndex = startRenderIndex + index;
          const transform = layout.getItemLayout(fileListIndex);
          return <img key={im.id} style={transform} src={im.src} />;
        })}
      </div>
    </div>
  );
};
```
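The determineViewportRegion call is where the omitted binary search lives. Here is a standalone sketch of it, operating directly on the array of top offsets rather than a layout object, and assuming the offsets are roughly non-decreasing in item order (a margin of one item on either side absorbs some of the variance between columns):

```typescript
// Binary search for the first item whose top offset reaches `edge`,
// assuming `tops` is (roughly) non-decreasing.
function firstIndexReaching(tops: Uint32Array, edge: number): number {
  let lo = 0;
  let hi = tops.length; // exclusive upper bound
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (tops[mid] < edge) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Sketch of the viewport search: find the index range of items that
// overlap the viewport, padded by one item of overdraw on each side.
function determineViewportRegion(
  tops: Uint32Array,
  scrollTop: number,
  clientHeight: number,
): [number, number] {
  const start = firstIndexReaching(tops, scrollTop);
  const end = firstIndexReaching(tops, scrollTop + clientHeight);
  return [Math.max(0, start - 1), Math.min(tops.length - 1, end)];
}
```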
Putting it all together, this is what we ended up with (links to a video on Imgur):
Conclusion
Computing the masonry layout performs great. Scrolling is also much smoother, and the result is more flexible than the popular NPM packages we tried.
Making use of WebAssembly was not really worth the hassle in the end, since the computation is fairly simple. It was a nicely scoped problem for learning some Rust, though. Running the computation in a WebWorker does make all the difference. Use workers, people!
There are certainly improvements to be made. You could, for instance, only compute the layout for the section of the viewport you are in.
There are much bigger bottlenecks in the code surrounding the layout computation though: it may take dozens of milliseconds to fetch thousands of images from the database and to insert their resolutions into WASM memory. This could be solved by streaming in data as it is being fetched. Both of these would add unnecessary complexity for our current use case, so we're calling it a day at this point!
Resources:
- The Allusion homepage - download it for free!
- The final implementation: Masonry algorithm in Rust, Webworker, Masonry renderer which makes use of the Virtualized renderer
- Similar blogpost: Building the Google Photos image grid