DEV Community

Cover image for Porting a simple Mark-Sweep Garbage Collector to Zig
Justin Ethier
Justin Ethier

Posted on

Porting a simple Mark-Sweep Garbage Collector to Zig

I recently completed my first Zig project - a port of Bob Nystrom's simple mark-sweep garbage collector (a modern classic!) from C to Zig.

Since Zig is a relatively new language I thought it would be helpful to give an overview of the experience and talk about the tooling.

This post explains a few parts of the program but I can't do justice to Bob's original article which I highly recommend reading. His C implementation is included in the project repo and is also available on GitHub.

Anyway, off we go...

Installing Zig

The installation process is very simple. Just download the latest version from the Releases page, extract to a folder, and add to your path.

For example here is how Zig is installed for this project's continuous integration:

wget https://ziglang.org/download/0.9.1/zig-linux-x86_64-0.9.1.tar.xz
tar xf zig-linux*.tar.xz
echo "`pwd`/zig-linux-x86_64-0.9.1" >> $GITHUB_PATH
Enter fullscreen mode Exit fullscreen mode

A real installation would put the files somewhere more appropriate but this works just fine for our CI.

zig comes as a large statically-linked executable. No dependencies required đź‘Ť.

Vim Plugin

Out of the box Vim works well enough as an editor for Zig but the experience is a bit underwhelming. So I installed the Vim configuration for Zig plugin get syntax highlighting.

As an unexpected bonus:

This plugin enables automatic code formatting on save by default using zig fmt.

Much like with Go its super handy to be able to just not worry about code formatting. On top of that zig fmt also catches basic syntax errors:

Image description

I tend to save frequently when coding so this provides constant feedback on the state of the code.

Printing

I just used debug printing to write output:

const print = @import("std").debug.print;
print("Collected {} objects, {} remaining.\n", .{ numObjects - self.numObjects, self.numObjects });
Enter fullscreen mode Exit fullscreen mode

An interesting observation is that instead of C varargs Zig uses an anonymous struct.

While Loops, If's, Optionals, and More

Zig has many improvements over standard C syntax.

Increment as part of the loop:

while (i < self.stack_size) : (i += 1) {
Enter fullscreen mode Exit fullscreen mode

Unbox optionals:

while (object.*) |obj| {
if (object.data.pair.head) |head| {
Enter fullscreen mode Exit fullscreen mode

Handle error unions:

var a = vm.pushPair() catch unreachable;
Enter fullscreen mode Exit fullscreen mode

Other miscellaneous improvements over C include standard naming conventions (underscores in names for non-callable variables) and a cleaner boolean type.

It is also helpful to define functions as part of a struct type, allowing a more natural organization of code. Though this has been a standard feature in most languages for a long time now. Including C++.

Pointers

The sweep function uses a pointer-to-pointer to walk the linked list of all objects and unlink unused objects:

void sweep(VM* vm)
{
  Object** object = &vm->firstObject;
  while (*object) {
    if (!(*object)->marked) {
      /* This object wasn't reached, so remove it from the list and free it. */
      Object* unreached = *object;

      *object = unreached->next;
      free(unreached);

      vm->numObjects--;
    } else {
      /* This object was reached, so unmark it (for the next GC) and move on to
       the next. */
      (*object)->marked = 0;
      object = &(*object)->next;
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Perhaps surprisingly, this function can be expressed line for line in Zig, just with a fresh new syntax.

fn sweep(self: *VM) void {
    var object = &(self.first_object);
    while (object.*) |obj| {
        if (!obj.marked) {
            // This object wasn't reached, so remove it from the list and free it.
            var unreached = obj;

            object.* = obj.next;
            self.allocator.destroy(unreached);

            self.num_objects -= 1;
        } else {
            // This object was reached, so unmark it (for the next GC) and move on to
            // the next.
            obj.marked = false;
            object = &(obj.next);
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

Testing

All of the tests were ported over to test declarations. These can be invoked via zig test mark-sweep.zig:

test "test 1" {
    const allocator = std.testing.allocator;
    print("Test 1: Objects on stack are preserved.\n", .{});

    var _vm = try VM.init(allocator);
    var vm = &_vm;
    try vm.pushInt(1);
    try vm.pushInt(2);

    vm.gc();

    try std.testing.expect(vm.num_objects == 2);
    vm.deinit();
}
Enter fullscreen mode Exit fullscreen mode

Another important point! In Zig one does not call malloc to allocate memory. Instead one of many allocators is used to allocate memory in the way that is most appropriate for the situation at hand. Our code is organized like a typical library where any allocator may be passed in and retained for memory allocations over the life of a VM instance.

For testing we use testing.allocator which fails the test case if any memory is leaked. This enhances our test as vm.deinit() removes all of the root objects and performs a collection. If our code is working properly all allocated memory will be freed.

Debugging with gdb

Having debugged C extensively gdb I was shocked that it Just Works for debugging Zig code:

justin@ubuntu:~/Documents/zig-mark-sweep-gc$ make
zig build-exe mark-sweep.zig
justin@ubuntu:~/Documents/zig-mark-sweep-gc$ ./mark-sweep 
Performance Test.
Collected 20000 objects, 0 remaining.
justin@ubuntu:~/Documents/zig-mark-sweep-gc$ gdb mark-sweep 
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from mark-sweep...
(gdb) break VM.sweep
Breakpoint 1 at 0x23091c: file /home/justin/Documents/zig-mark-sweep-gc/mark-sweep.zig, line 91.
(gdb) run
Starting program: /home/justin/Documents/zig-mark-sweep-gc/mark-sweep 
Performance Test.

Breakpoint 1, VM.sweep (self=0x7fffffffd938) at /home/justin/Documents/zig-mark-sweep-gc/mark-sweep.zig:91
91              var object = &(self.first_object);
(gdb) n
92              while (object.*) |obj| {
(gdb) 
Enter fullscreen mode Exit fullscreen mode

Conclusion

That's it for now. All in all this was a great first experience with Zig and I look forward to writing more Zig in the near future.

Useful links:

Top comments (0)