First off, I feel like the second part of the title is something pretty much every developer has experienced at one point. Second, obligatory GitHub repository plug:
AshtonSnapp / hasm
The Official Cellia Cross-Assembler for Modern Computers
hasm
The Homebrew Assembler. Currently supporting the 16-bit Cellia architecture and the 8-bit ROCKET88 architecture.
Each architecture supported by `hasm` will be separated into its own module, although every architecture's assembler code will have the same general structure: you have a lexer which takes in files of assembly code and outputs streams of tokens, which are fed into a parser which structures those tokens into a file syntax tree. Then the syntax trees are fed into a linker which tries to combine all of these trees into a single program tree, which is finally fed into a binary generator which does exactly what you think.
Right now I'm still trying to implement the assemblers for the two architectures I mentioned earlier, and I've only just now gotten to the parser. It's going to be a pain to write anything that's actually decently capable, but it'll be worth…
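To make the lexer, parser, linker, and binary generator stages from the README excerpt above a bit more concrete, here is a minimal sketch of how they could chain together. Everything in it is an assumption for illustration: the type names (`Token`, `FileTree`, `ProgramTree`), the function names, the mnemonics in `main`, and especially the "encoding", which just emits one byte per statement. None of it is hasm's actual code.

```rust
// Hypothetical stand-ins for the four pipeline stages; not hasm's real types.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Word(String),
    Newline,
}

// One syntax tree per source file: a list of statements, each a list of words.
#[derive(Debug, PartialEq)]
struct FileTree {
    lines: Vec<Vec<String>>,
}

// The combined tree for the whole program.
#[derive(Debug, PartialEq)]
struct ProgramTree {
    lines: Vec<Vec<String>>,
}

// Lexer: source text in, flat token stream out.
fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    for line in source.lines() {
        for word in line.split_whitespace() {
            tokens.push(Token::Word(word.to_string()));
        }
        tokens.push(Token::Newline);
    }
    tokens
}

// Parser: groups tokens into statements, skipping blank lines.
fn parse(tokens: Vec<Token>) -> FileTree {
    let mut lines = Vec::new();
    let mut current = Vec::new();
    for token in tokens {
        match token {
            Token::Word(word) => current.push(word),
            Token::Newline => {
                if !current.is_empty() {
                    lines.push(std::mem::take(&mut current));
                }
            }
        }
    }
    if !current.is_empty() {
        lines.push(current);
    }
    FileTree { lines }
}

// Linker: concatenates the per-file trees in order.
fn link(trees: Vec<FileTree>) -> ProgramTree {
    ProgramTree {
        lines: trees.into_iter().flat_map(|tree| tree.lines).collect(),
    }
}

// Binary generator: placeholder encoding, one byte per statement holding
// its word count. A real generator would emit opcodes and operands.
fn generate(program: &ProgramTree) -> Vec<u8> {
    program.lines.iter().map(|line| line.len() as u8).collect()
}

fn main() {
    // Two "files" with made-up mnemonics.
    let files = ["NOP\nLOAD A 5", "HALT"];
    let trees: Vec<FileTree> = files.iter().map(|src| parse(lex(src))).collect();
    let binary = generate(&link(trees));
    println!("{:?}", binary);
}
```

Keeping each stage as a plain function from one data type to the next means every stage can be tested on its own, which helps when only the lexer exists so far.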
I did a few things since the last post. First, I set up a GitHub Actions workflow that will build and test the project every time someone commits. Right now, I'm the only person working on this. Sadly. Second, I've implemented a token variant for instructions, an enum that contains literally every single instruction in the architecture (all 51 of them), and four different callback functions for determining which instruction a token is.
Then I had to give directives higher priority because `logos` was getting confused. I also made far too many regexps for addresses and immediates, partly because of how callbacks work, and partly because the architecture has 7 different addressing modes and I want to allow decimal, hexadecimal, and binary numbers for both addresses and immediates. 7 addressing modes plus two sizes for immediates, multiplied by 3 bases, means 27 regexps for addresses and immediates alone. I haven't even written the callbacks yet, and all they will do is make sure the token is a valid number.
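As a rough idea of what one of those "is this a valid number" callbacks might boil down to, here is a small standalone sketch in plain Rust. It assumes `0x` and `0b` prefixes for hexadecimal and binary and a 16-bit result; the post doesn't say which prefixes the assembler actually uses, and `parse_number` and `parse_digits` are hypothetical names, not part of `logos` or hasm. Wiring it into a real `logos` callback is left out on purpose.

```rust
// Hypothetical helper: parse `digits` in the given radix, rejecting empty
// input and sign characters (from_str_radix alone would accept a leading '+').
fn parse_digits(digits: &str, radix: u32) -> Option<u16> {
    if digits.is_empty() || !digits.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    // Returns None on overflow, i.e. values that don't fit in 16 bits.
    u16::from_str_radix(digits, radix).ok()
}

// Hypothetical validator covering the three bases mentioned in the post.
// The "0x" and "0b" prefixes are an assumption, not hasm's confirmed syntax.
fn parse_number(text: &str) -> Option<u16> {
    if let Some(hex) = text.strip_prefix("0x") {
        parse_digits(hex, 16)
    } else if let Some(bin) = text.strip_prefix("0b") {
        parse_digits(bin, 2)
    } else {
        parse_digits(text, 10)
    }
}

fn main() {
    // (7 addressing modes + 2 immediate sizes) * 3 bases, as counted above.
    let regex_count = (7 + 2) * 3;
    println!("regexps needed: {}", regex_count);

    for text in ["42", "0x1F", "0b101", "0xZZ", "0x10000"] {
        println!("{:>8} -> {:?}", text, parse_number(text));
    }
}
```

Since every callback ends up doing the same validation, one shared helper like this could sit behind all 27 patterns, with only the surrounding addressing-mode syntax differing per regexp.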
And now, I have an interesting... problem whenever I try to compile the program. This issue only popped up recently, after the previous post, meaning something changed for some god dang reason, and now I'm getting `E0277: the size for values of type [u8] cannot be known at compilation time`, and the error is pointing out the logos derive bit, followed by a line inside `logos::lexer::Lexer`. Why is this happening? I have no idea what I did that caused this, but whatever it was happened some time between the last post and this one!
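For context on what that error means in general, here is a tiny standalone example of E0277 with `[u8]`. This is only an illustration of the compiler rule, not a diagnosis of the bug in this project; the function names are made up, and nothing here involves `logos`. Generic type parameters carry an implicit `Sized` bound, and a bare slice type like `[u8]` has no size known at compile time, so it only fits where the bound is relaxed with `?Sized`.

```rust
use std::mem::size_of_val;

// `T` has an implicit `T: Sized` bound, so `T = [u8]` is rejected.
fn needs_sized<T>(value: &T) -> usize {
    size_of_val(value)
}

// `?Sized` opts out of that bound, so unsized types like `[u8]` are
// accepted as long as they stay behind a reference.
fn accepts_unsized<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    let array = [1u8, 2, 3];
    let bytes: &[u8] = &array;

    // Works: here T is the array type [u8; 3], which is Sized.
    println!("array: {} bytes", needs_sized(&array));

    // Works: here T is the slice type [u8], allowed by `?Sized`.
    println!("slice: {} bytes", accepts_unsized(bytes));

    // Uncommenting the next line makes T = [u8] and fails with
    // error[E0277]: the size for values of type `[u8]` cannot be known
    // at compilation time.
    // needs_sized(bytes);
}
```

When the error points into a derive macro, the `[u8]` is usually being chosen by generated code rather than anything written by hand, which is why the message can seem to come out of nowhere.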
Honestly, at this point, I should probably go over to the logos repository and make an issue because I have no idea what caused this or how to fix it. Well, until next time, have an awesome day. Because I most certainly am not having an awesome day.
Top comments (2)
Hey! Just want to mention that you're not exactly alone in creating an assembler in Rust. Yours might be a little bit more... err.... useful. But if you'd like to talk about it, I think that would be great.
The second you mentioned it I had to go look at your GitHub. It's certainly interesting. I've never played Kerbal Space Program, but I have seen videos and know how good a game it is, and I never knew someone made a KerbalOS for the game!
The most interesting part is that you're doing it all from scratch, mostly. I tried that when I started this project, and it didn't end so well: even when it compiled, it didn't work. That's why I've moved over to using the `logos` library. Yet your project appears to be working (and far ahead, with a preprocessor and parser). I feel like I've rambled for a bit too long here. I'll be updating this series sometime today, however, as I've gotten it to compile (finally!), though there will most likely still be some issues that I'll have to work out. Good luck with your project!