I recently started learning Rust using the awesome CodeCrafters website.
CodeCrafters is one of the best ways to learn a language and the technology underneath it: it guides you through rebuilding existing software from scratch.
When I completed the Build your own Interpreter challenge I realized that I left behind some pretty messy code:
fn run_file(&self, args: &Vec<String>) {
    let command = &args[1];
    let filename = &args[2];
    match command.as_str() {
        "tokenize" => {
            let file_contents = fs::read_to_string(filename).unwrap_or_else(|_| {
                writeln!(io::stderr(), "Failed to read file {}", filename).unwrap();
                String::new()
            });
            let mut scanner = Scanner::new(file_contents);
            scanner.scan_tokens();
            for token in scanner.tokens {
                println!("{}", token);
            }
            if unsafe { HAD_ERROR } {
                exit(65);
            }
        }
        "parse" => {
            let file_contents = fs::read_to_string(filename).unwrap_or_else(|_| {
                writeln!(io::stderr(), "Failed to read file {}", filename).unwrap();
                String::new()
            });
            let mut scanner = Scanner::new(file_contents);
            scanner.scan_tokens();
            let tokens = scanner.tokens.into_boxed_slice();
            let mut parser = Parser::new(tokens);
            let expr = parser.parse_expression();
            if unsafe { HAD_ERROR } {
                exit(65);
            }
            let mut ast_printer = AstPrinter::new();
            println!("{}", ast_printer.print(expr.as_ref().unwrap()));
        }
        "evaluate" => {
            let file_contents = fs::read_to_string(filename).unwrap_or_else(|_| {
                writeln!(io::stderr(), "Failed to read file {}", filename).unwrap();
                String::new()
            });
            let mut scanner = Scanner::new(file_contents);
            scanner.scan_tokens();
            let tokens = scanner.tokens.into_boxed_slice();
            let mut parser = Parser::new(tokens);
            let expr = parser.parse_expression();
            if unsafe { HAD_ERROR } {
                exit(65);
            }
            let mut interpreter = Interpreter::new();
            interpreter.interpret_expression(expr.as_ref().unwrap());
            if unsafe { HAD_RUNTIME_ERROR } {
                exit(70);
            }
        }
        "run" => {
            let file_contents = fs::read_to_string(filename).unwrap_or_else(|_| {
                writeln!(io::stderr(), "Failed to read file {}", filename).unwrap();
                String::new()
            });
            let mut scanner = Scanner::new(file_contents);
            scanner.scan_tokens();
            let tokens = scanner.tokens.into_boxed_slice();
            let mut parser = Parser::new(tokens);
            let statements = parser.parse();
            if unsafe { HAD_ERROR } {
                exit(65);
            }
            let mut interpreter = Interpreter::new();
            interpreter.interpret(statements);
            if unsafe { HAD_RUNTIME_ERROR } {
                exit(70);
            }
        }
        _ => {
            writeln!(io::stderr(), "Unknown command: {}", command).unwrap();
            return;
        }
    }
}
Looking at the different cases, it's easy to spot recurring implementations.
Before we dive in: the "Build your own Interpreter" challenge walks you through building, just as the name suggests, an interpreter. I won’t go into the specifics of the interpreter itself, but for this post, it’s good to know the overall structure:
- Read the source code into a string
- Scan tokens
- Parse tokens (build an AST)
- Evaluate expressions and statements
The code above follows roughly the same steps. Let’s break down the structure of each match arm:
- tokenize
  - read file
  - scan
  - print tokens
- parse
  - read file
  - scan
  - parse expression
  - print AST
- evaluate
  - read file
  - scan
  - parse expression
  - interpret expression
- run
  - read file
  - scan
  - parse
  - interpret
See the pattern? Let's say something changes in how files are read, or you want to add a new parameter to the Scanner's constructor. You’d have to update every place where the file is read or the scanner is created. In such a small codebase that's not a huge issue, but in a larger project the duplicated code becomes a maintenance problem. That’s why it’s a good idea to develop a habit of refactoring early. Let’s apply the Extract Method refactoring pattern.
First, I moved the file-reading logic into a separate function.
I also changed the filename parameter type from &String to &str. Passing &str is the more flexible and idiomatic approach: a &String automatically dereferences to &str, but passing a &str where a &String is expected would require an explicit conversion.
// Reads the whole file; on failure, reports the error and returns an empty string.
fn read_file(&self, filename: &str) -> String {
    fs::read_to_string(filename).unwrap_or_else(|_| {
        writeln!(io::stderr(), "Failed to read file {}", filename).unwrap();
        String::new()
    })
}
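To make the &str versus &String point concrete, here is a minimal sketch. The function names describe and describe_owned are made up for illustration and are not part of the interpreter.

```rust
// Hypothetical helper: accepts any string slice.
fn describe(name: &str) -> String {
    format!("reading {}", name)
}

// Hypothetical helper that insists on a borrowed String.
fn describe_owned(name: &String) -> String {
    format!("reading {}", name)
}

fn main() {
    let owned: String = String::from("test.lox");
    let literal: &str = "test.lox";

    // A &String coerces to &str automatically (deref coercion).
    println!("{}", describe(&owned));
    // A string literal is already a &str.
    println!("{}", describe(literal));

    // describe_owned(literal) would not compile. Going the other way
    // needs an explicit conversion, which allocates a new String:
    println!("{}", describe_owned(&literal.to_string()));
}
```

With a &str parameter the caller can pass either kind of string without extra work, which is why it is the usual choice for read-only string arguments.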
Next, I extracted the tokenization logic since it's used in every case:
// Reads the file and scans its contents into tokens.
fn tokenize(&self, filename: &str) -> Vec<Token> {
    let source = self.read_file(filename);
    let mut scanner = Scanner::new(source);
    scanner.scan_tokens();
    scanner.tokens
}
Both "parse" and "evaluate" call parser.parse_expression(), so I extracted that as well.
Notice that instead of re-implementing file reading and tokenization, this function simply calls self.tokenize(filename), reusing the extracted logic. Refactoring is already paying off here, and it helps keep your codebase maintainable.
// Tokenizes the file and parses a single expression.
fn parse_expression(&self, filename: &str) -> Option<Expr> {
    let tokens = self.tokenize(filename);
    let tokens = tokens.into_boxed_slice();
    let mut parser = Parser::new(tokens);
    parser.parse_expression()
}
Though parse is only used once, it’s still a good habit to extract it into its own function:
// Tokenizes the file and parses it into a list of statements.
fn parse(&self, filename: &str) -> Vec<Statement> {
    let tokens = self.tokenize(filename);
    let tokens = tokens.into_boxed_slice();
    let mut parser = Parser::new(tokens);
    parser.parse()
}
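The two parsing functions still share their first three lines, so the same technique could be applied once more. The sketch below is not part of the original code. It uses toy stand-ins for Token and Parser, tokenizes a string instead of reading a file, and introduces a hypothetical make_parser helper, purely to show the shape of the extraction.

```rust
// Toy stand-ins for the interpreter's types; the real ones live in the repo.
#[derive(Debug, Clone, PartialEq)]
struct Token(String);

struct Parser {
    tokens: Box<[Token]>,
}

impl Parser {
    fn new(tokens: Box<[Token]>) -> Self {
        Parser { tokens }
    }

    // Toy "expression": just the first token, if there is one.
    fn parse_expression(&mut self) -> Option<Token> {
        self.tokens.first().cloned()
    }

    // Toy "statements": every token becomes one statement.
    fn parse(&mut self) -> Vec<Token> {
        self.tokens.to_vec()
    }
}

// Toy tokenizer: splits on whitespace instead of scanning a file.
fn tokenize(source: &str) -> Vec<Token> {
    source.split_whitespace().map(|s| Token(s.to_string())).collect()
}

// The shared setup, extracted once (hypothetical helper name).
fn make_parser(source: &str) -> Parser {
    Parser::new(tokenize(source).into_boxed_slice())
}

fn parse_expression(source: &str) -> Option<Token> {
    make_parser(source).parse_expression()
}

fn parse(source: &str) -> Vec<Token> {
    make_parser(source).parse()
}

fn main() {
    println!("{:?}", parse_expression("1 + 2"));
    println!("{:?}", parse("1 + 2"));
}
```

Whether this extra step is worth it is a judgment call: with only two callers the duplication is small, but a change to how the parser is constructed would then touch a single place.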
Here’s the final version of the run_file function. Although the overall amount of code hasn't changed much, the function itself is now shorter, cleaner, and easier to maintain:
fn run_file(&self, args: &Vec<String>) {
    let command = &args[1];
    let filename = &args[2];
    match command.as_str() {
        "tokenize" => {
            let tokens = self.tokenize(filename);
            for token in tokens {
                println!("{}", token);
            }
            if unsafe { HAD_ERROR } {
                exit(65);
            }
        }
        "parse" => {
            let expr = self.parse_expression(filename);
            if unsafe { HAD_ERROR } {
                exit(65);
            }
            let mut ast_printer = AstPrinter::new();
            println!("{}", ast_printer.print(expr.as_ref().unwrap()));
        }
        "evaluate" => {
            let expr = self.parse_expression(filename);
            if unsafe { HAD_ERROR } {
                exit(65);
            }
            let mut interpreter = Interpreter::new();
            interpreter.interpret_expression(expr.as_ref().unwrap());
            if unsafe { HAD_RUNTIME_ERROR } {
                exit(70);
            }
        }
        "run" => {
            let statements = self.parse(filename);
            if unsafe { HAD_ERROR } {
                exit(65);
            }
            let mut interpreter = Interpreter::new();
            interpreter.interpret(statements);
            if unsafe { HAD_RUNTIME_ERROR } {
                exit(70);
            }
        }
        _ => {
            writeln!(io::stderr(), "Unknown command: {}", command).unwrap();
            return;
        }
    }
}
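The exit-code checks still repeat in every arm, so they are a candidate for the same treatment. The sketch below is one possible next step, not something from the original code. The helpers exit_code and exit_on_error are hypothetical, and the flags are passed in as plain booleans rather than read from the HAD_ERROR and HAD_RUNTIME_ERROR statics.

```rust
use std::process::exit;

// Exit codes used in the post: 65 for scan/parse errors, 70 for runtime errors.
const EXIT_SYNTAX_ERROR: i32 = 65;
const EXIT_RUNTIME_ERROR: i32 = 70;

// Hypothetical helper: decides which exit code, if any, the flags call for.
fn exit_code(had_error: bool, had_runtime_error: bool) -> Option<i32> {
    if had_error {
        Some(EXIT_SYNTAX_ERROR)
    } else if had_runtime_error {
        Some(EXIT_RUNTIME_ERROR)
    } else {
        None
    }
}

// Hypothetical helper: terminates the process only when a flag is set.
fn exit_on_error(had_error: bool, had_runtime_error: bool) {
    if let Some(code) = exit_code(had_error, had_runtime_error) {
        exit(code);
    }
}

fn main() {
    // Neither flag is set, so execution simply continues.
    exit_on_error(false, false);
    println!("no errors, carrying on");
}
```

Keeping the decision (exit_code) separate from the side effect (exit_on_error) makes the mapping from flags to codes easy to test without terminating the process.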
I am still new to Rust, so feel free to share any feedback or leave a comment with your thoughts or questions.
More info on Extract Method. If you prefer books, I recommend the following two:
- Martin Fowler: Refactoring – This book covers all the core refactoring techniques.
- Michael Feathers: Working Effectively with Legacy Code – great resource for working with legacy codebases.
For the complete source code, check out my GitHub repo.