Building a stack-based virtual machine, part 5 - the machine

#rust #language #theory #computerscience

In the last episode we built a code generator and a program representation so we now have all but that last bit we need to run our virtual machine.

Introducing the machine

So far we've built all the primary structures required to assemble our program, we have code, instructions and a stack. The last major thing we need to do before we can run our programs is the machine itself. Like Captain Planet we're going to assemble our constituent parts into a greater whole.

Here's our machine:

pub struct Machine<'a, T> {
  code: Code<T>,
  instruction_table: &'a InstructionTable<T>,
  ip: usize,
  operand_stack: Stack<T>
}

impl<'a, T> Machine<'a, T> {
  pub fn new(code: Code<T>, instruction_table: &'a InstructionTable<T>) -> Machine<'a, T> {
    Machine {
      code: code,
      instruction_table: instruction_table,
      ip: 0,
      operand_stack: Stack::new()
    }
  }
}

So when we create our machine we pass in our code and our instruction table. It sets the instruction pointer to zero (ie the first instruction in the code), creates an empty stack and returns our fancy new machine.

Before we can run any code we need to give it some handy helper methods for manipulating the operand stack and retrieving data from the data section. These are used by our instructions because if you remember they receive a mutable reference to the machine as their first argument.

impl<'a, T> Machine<'a, T> {

    pub fn operand_push(&mut self, value: T) {
        self.operand_stack.push(value);
    }

    pub fn operand_pop(&mut self) -> T {
        self.operand_stack.pop()
    }

    pub fn get_data(&self, idx: usize) -> &T {
        self.code
            .data
            .get(idx)
            .expect(&format!("Constant data is not present at index {}.", idx))
    }

}

No we can finally go ahead and add our run method to the machine. It's a reasonable chunk of code so let's break it down a bit first into some basic ideas:

Keep looping until the instruction pointer (IP) is the same as the length of the program (ie we've run out of program).
Read the opcode from our program at IP.
Read the arity from our program at IP + 1.
From 0 to arity read argument indexes from the program and push them into an arguments vector.
Call the instruction with the arguments.

impl<'a, T> Machine<'a, T> {

    pub fn run(&mut self) {
        loop {
            if self.ip == self.code.code.len() {
                break;
            }

            let op_code = self.next_code();
            let arity = self.next_code();

            let instr = self.instruction_table.by_op_code(op_code).expect(&format!(
                "Unable to find instruction with op code {}",
                op_code
            ));

            let mut args: Vec<usize> = vec![];

            for _i in 0..arity {
                args.push(self.next_code());
            }

            let fun = instr.fun;
            fun(self, args.as_slice());
        }
    }

    fn next_code(&mut self) -> usize {
        let code = self.code.code[self.ip];
        self.ip = self.ip + 1;
        code
    }

}

Now we have enough infrastructure in place to run our first program!

We're going to make a really simple arithmetic machine because I can do that without making this blog post too long. We need two instructions; Push which takes data and pushes it to the stack and Add which pops two operands off the stack, adds them together and pushes the result back onto the stack:

fn push(machine: &mut Machine<isize>, args: &[usize]) {
    let arg = *machine.get_data(args[0]);
    machine.operand_push(arg);
}

fn add(machine: &mut Machine<isize>, _args: &[usize]) {
    let rhs = machine.operand_pop();
    let lhs = machine.operand_pop();
    machine.operand_push(lhs + rhs);
}

fn instruction_table() -> InstructionTable<isize> {
    let mut it = InstructionTable::new();
    it.insert(Instruction::new(0, "push", 1, push));
    it.insert(Instruction::new(1, "add", 0, add));
}

Next we need our program, it's going to add 2 and 3 together:

fn build_program(it: &InstructionTable<isize>) -> Code<isize> {
    let mut builder = Builder::new(&it);
    builder.push("push", vec![2]);
    builder.push("push", vec![3]);
    builder.push("add", vec![]);
    Code::from(builder)
}

Now let's run it:

#[test]
fn addition_example() {
    let it = instruction_table();
    let code = build_program(&it);
    let mut machine = Machine::new(code, &it);
    machine.run();
    let result = machine.operand_pop();
    assert_eq!(result, 5);
}

And that's it. We've run our very first program in our virtual machine. Of course it's not very useful without function calls, so we'll add them in the next episode.

Top comments (2)

scprotz • Jul 26 '22

Just a note. I made it this far just copy/pasting code, but the code up to this point is missing a few things (that are in the final source). There is no push_data() in Builder yet. And InstructionTable does not have symbols. I'll double check again and try to make sure I didn't overlook something, but just wanted to point those out in case someone (like me) tries to work through the blog as a tutorial.

James Harton • Aug 4 '22

Thanks for the note. I wrote this a few years ago and honestly I don’t even remember it very well but I’ll take a look at it and see what needs editing.

DEV Community

Building a stack-based virtual machine, part 5 - the machine

Introducing the machine

Top comments (2)

Read next

Interview with Eson (Seven), Creator of DocKit!

Building a CHIP-8 Emulator in Rust - An Advanced Adventure

Using Polars in Rust for high-performance data analysis

Implement React v18 from Scratch Using WASM and Rust - [27] Implement useTransition