Introduction
The recent (hopefully temporary) death of Actix-web forced me to revisit the code of a pet project of mine that was not cleanly decoupled: some of my Diesel code was mixed with my Actix code. So I will share the process and stages of decomposition in Rust.
This is thoroughly explained in The Book, but I'll try to make it more concise and simple. I'll skip many details in favor of clarity, and of course the code itself is not useful; its only purpose is to show the process step by step, as simply as I can. I hope it is helpful.
I also uploaded a repo to GitHub (https://github.com/robertorojasr/rust-split-example). If you clone it, each commit is another step in the process. And unlike this example, it runs (it doesn't do much, but it runs).
This evolution should look like this:
- Single file with everything on it.
- Move part of your code to a different module (this step is missing in the repo, I forgot, sue me... please don't sue me).
- Move the module to a different file.
- Turn the module in a single file into a folder acting as a module with multiple files as sub-modules.
- Split your crate into a library and an executable that live in the same directory tree.
- Move the executable and the library to different crates (inside a workspace), each with its own directory tree.
When your project gets bigger and more complex, these strategies will help keep the overall structure of your code manageable, clear and decoupled.
0. Starting point
So you make your "Hello World" and everything is cool. You replace that println and start coding, and soon you notice your lovely main.rs is getting bigger, filled with structs, functions and traits. Some of them are linked to each other and some are not. Like Neo watching the Matrix, you start to see some macro-structure and you tell yourself, "myself, it is time to tidy this mess". Of course you could design everything before you start (you should, actually), but you want to avoid overengineering, and as this post shows (or at least hopes to), it is not necessary to bring out the big guns from the start.
Let's say this is your current masterpiece. Impressive, I know, I did it all by myself.
// src/main.rs
struct A {
    a: i32,
}

struct B {
    b: i32,
}

fn main() {
    let first = A { a: 42 };
}
1. A module
So how do you start? The first thing would be to make modules to encapsulate some code:
// src/main.rs
mod something {
    struct A {
        a: i32,
    }

    struct B {
        b: i32,
    }
}

fn main() {
    let first = A { a: 42 };
}
But now there is a problem: even though A is defined in the same file, it is no longer in the same module, so main() has no idea what A is. We have to import it from something:
// src/main.rs
mod something {
    struct A {
        a: i32,
    }

    struct B {
        b: i32,
    }
}

use crate::something::*; // <- this is new

fn main() {
    let first = A { a: 42 };
}
Now main() knows that something is there and that it can use everything public in it. But wait, there is nothing public in there, so we have to make public whatever we want main() to see.
// src/main.rs
mod something {
    pub struct A { // <- this is new
        pub a: i32, // <- this is new
    }

    pub struct B { // <- this is new
        pub b: i32, // <- this is new
    }
}

use crate::something::*;

fn main() {
    let first = A { a: 42 };
}
Now main() can see everything inside. Of course you don't have to make everything public, just what you need. And spare me the OOP argument that you shouldn't expose data; this is just an example, and OOP is not all there is, BTW (a bit of sass here).
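To make the "just what you need" part concrete, here is a minimal sketch of selective visibility. The private field in B, the constructor `new` and the getter `b` are my own additions for illustration; they are not part of the example above.

```rust
// Sketch: which items `main` can reach from an inline module.
mod something {
    pub struct A {
        pub a: i32, // public field: readable and writable from outside
    }

    pub struct B {
        b: i32, // private field: only code inside `something` can touch it
    }

    impl B {
        // A public constructor is a common way to build a struct
        // that has private fields (hypothetical helper).
        pub fn new(b: i32) -> Self {
            B { b }
        }

        // Read-only access to the private field (hypothetical helper).
        pub fn b(&self) -> i32 {
            self.b
        }
    }
}

use something::{A, B};

fn main() {
    let first = A { a: 42 };
    let second = B::new(7);
    // let broken = B { b: 7 }; // would not compile: field `b` is private
    println!("{} {}", first.a, second.b());
}
```

Here main() can build A directly because its field is public, but it has to go through `B::new` for B.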
2. A module in another file
But that doesn't solve the fact that your IDE gets laggy with your still huge file; in fact, you just added more stuff! To solve this you should start using Vim... I'm joking... (Am I?)
So now you want to move that module out to another file, to live free and alone.
You keep the module declaration and the import in main.rs
// src/main.rs
mod something;
// the content of the module was here

use crate::something::*;

fn main() {
    let first = A { a: 42 };
}
and put the module content in a freshly made file in the same folder as main.rs
// src/something.rs
pub struct A {
    pub a: i32,
}

pub struct B {
    pub b: i32,
}
When you added the module something in main.rs
// src/main.rs
mod something;
// the rest of it
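Rust automagically looks for the module's code. Because the declaration ends in a semicolon instead of a { } block, the code is not inline, so Rust looks for a file with the module name in the same folder (in this case src/something.rs). If it doesn't find one, it looks for a folder with the module name and a file named mod.rs inside it (src/something/mod.rs), and reads the code from there. If both files exist, the compiler reports an error because it can't tell which one you mean.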
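The lookup rule can be written down as a tiny function. This is only a sketch of the rule as described above, not how the compiler is implemented, and `candidate_module_files` is a hypothetical name of my own.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: the two files a `mod <name>;` declaration
// written in src/main.rs (or src/lib.rs) can resolve to, in the
// order described in the text.
fn candidate_module_files(dir: &Path, name: &str) -> [PathBuf; 2] {
    [
        // 1. A file named after the module, next to the declaring file.
        dir.join(format!("{}.rs", name)),
        // 2. A folder named after the module, with a mod.rs inside.
        dir.join(name).join("mod.rs"),
    ]
}

fn main() {
    for path in candidate_module_files(Path::new("src"), "something").iter() {
        println!("{}", path.display());
    }
}
```

Exactly one of the two candidates should exist for a given module.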
3. A module in a folder with many submodules
As mentioned in the last paragraph, we can split our module even further. All we have to do is make a folder named something, so we get:
src/
|_ main.rs
|_ something.rs
|_ something/
Now, we could just rename something.rs to mod.rs and move it inside something/, but what's the point in that? We want to split things! So we are going to give A and B (please don't name your stuff like that outside of examples like this) their own modules. We'll end up with this tree:
src/
|_ main.rs
|_ something/
   |_ mod.rs
   |_ a.rs
   |_ b.rs
But what happened to something.rs? Well, my friend, you split it: A goes to a.rs and you can guess where B went.
Now A and B are in their own modules, so we modify the imports accordingly:
// src/main.rs
mod something; // <- still needed: it now resolves to src/something/mod.rs

use crate::something::a::*; // <- this is new
use crate::something::b::*; // <- this is new

fn main() {
    let first = A { a: 42 };
}
But mod.rs now has the responsibility of declaring its children. As I said before, when Rust looks for something.rs and doesn't find it, it checks for the folder something and then looks inside it for a file named mod.rs.
// src/something/mod.rs
pub mod a;
pub mod b;
You may notice that this is the same thing we did with main at first. You can keep nesting modules as long as you like, just like that.
// src/something/a.rs
pub struct A {
    pub a: i32,
}

// src/something/b.rs
pub struct B {
    pub b: i32,
}
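If you want to try the nested layout without creating any files, it can be mirrored in a single file with inline modules. This is a sketch under that assumption. The `pub use` re-exports and the `sum_fields` function are my own additions for illustration and are not used anywhere else in this post.

```rust
// Single-file stand-in for src/something/{mod.rs, a.rs, b.rs}.
mod something {
    // In the real layout these are `pub mod a;` and `pub mod b;`
    // in src/something/mod.rs, with the bodies living in a.rs and b.rs.
    pub mod a {
        pub struct A {
            pub a: i32,
        }
    }

    pub mod b {
        pub struct B {
            pub b: i32,
        }
    }

    // Optional re-exports (an addition for illustration): callers can
    // write `something::A` instead of `something::a::A`.
    pub use self::a::A;
    pub use self::b::B;
}

// Hypothetical function that touches both structs.
fn sum_fields() -> i32 {
    // Full path through the submodule.
    let first = something::a::A { a: 42 };
    // Shorter path through the re-export.
    let second = something::B { b: 1 };
    first.a + second.b
}

fn main() {
    println!("{}", sum_fields());
}
```

Re-exporting is a design choice: it lets you reorganize the files inside something/ later without changing the paths that callers use.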
Fine and dandy, but now what? What if my project is a huge beast? What if I could reuse some of the code? Let's say you had a nice webapp with Diesel ORM and Actix-web on top. What if some day, let's say, the Actix creator leaves and its future is uncertain? What then, huh? What then?!
...
Well, you could see your webapp as a library that deals with the DB (let's say with Diesel) plus a separate consumer: an executable built with Actix-web, or with some other framework that is not apparently dead (but maybe comes back...).
You could also make a CLI UI, for example, that uses the same DB-related codebase. Let's do it!
4. Rust library with an executable file
As you may have read (and skipped on your way to more fun stuff), Rust recognizes 2 kinds of crates (the official name of what I've been calling a project, just because I'm a rebel): libraries and executables. You probably know the difference, but for completeness let's put it simply: an executable is something you run directly, and a library is something that is used by an executable.
What we want in this case is to put our main() in another file as an executable and leave all of our structs in a library for future reuse in other executables.
Rust tries really hard to make things easy (because it feels guilty for all the suffering with borrows and lifetimes), so to turn a crate into a lib you just rename main.rs to lib.rs. Voilà, now it is a library. But let's make something useful with it: we'll make a new folder bin/ and put a copy of our ex-main.rs (now renamed lib.rs) in it, under some fancy and descriptive name, leaving you with this new tree:
src/
|_ lib.rs                              // <- just a renamed main.rs
|_ bin/                                // <- this folder is new
|  |_ framework_that_broke_my_heart.rs // <- a copy of our ex-main.rs
|_ something/
   |_ mod.rs
   |_ a.rs
   |_ b.rs
Everything inside something/ stays untouched for now; first we'll work on framework_that_broke_my_heart.rs. The first thing you will notice is that everything inside bin/ lives in a bubble universe: even though it is inside our crate's folder, it is not part of the library anymore. Think of it like us and society (not you, well-adjusted programmers... ugh). So we have to call our newly created library (you know, from when we renamed main.rs to lib.rs) just as we would call any other library.
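// src/bin/framework_that_broke_my_heart.rs
extern crate this_example; // oh right, I never named this crate:
// it is the name you give in Cargo.toml, under [package], in the
// *name* field (dashes there become underscores in Rust code).
// This line is optional since the 2018 edition.
use this_example::something::a::*; // not `crate::`, which here would
use this_example::something::b::*; // mean this binary, not the library

fn main() {
    let first = A { a: 42 };
}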
and in lib.rs (our ex-main.rs):
// src/lib.rs
pub mod something; // <- this is all, is like telling Rust
// copy/paste everything inside `something` inside a `mod` here
Everything inside something is untouched.
We are close to the finish line. By now you may have wondered about the situation of that poor fellow whose heart was broken by the early demise of his favorite framework, and asked yourself: how the F* did he put all his Actix-web code inside one tiny little file? That file must be huge, a huge mess. The whole idea was to split things and he just made everything worse! Well, my fellow, this is where the next point comes in.
5. Workspaces
This turns your crate into smaller crates under one big crate-ish umbrella. You could just split your code into 2 crates by now; after all, we claimed that the executable in the last step was already not part of the original crate, and that's right. Splitting the code entirely into 2 separate crates is a valid choice, but many of the dependencies are common to both the executable and the library, and it would be annoying to build 2 times, test 2 times, etc. If the library(es) and executable(s) are related to each other, you may want to treat them as 1 thing for building/testing/running purposes, and you may also want to keep them in the same repo.
This one is going to get a bit more complicated, but not much.
We are going to make 2 crates inside our original one and glue them together.
So we had this beauty:
./
|_Cargo.toml
|_Cargo.lock
|_target/
|  |_ ... // we don't care about this, it is made in the build process
|
|_src/
   |_ lib.rs
   |_ bin/
   |  |_ framework_that_broke_my_heart.rs
   |_ something/
      |_ mod.rs
      |_ a.rs
      |_ b.rs
Sitting in ./ right next to the "Cargos", we make two new crates.
$ cargo init --lib db_stuff
$ cargo init ftbmh
# ftbmh = framework_that_broke_my_heart: too long, too lazy.
# Again, this is an example. Name your things with common sense
# and don't be funny in a real project: the fun will last about
# 10 minutes, the pain much more than that.
The only thing the --lib argument does is create a lib.rs instead of a main.rs; by default cargo makes an executable.
So we will get:
./
|_Cargo.toml
|_Cargo.lock
|_target/
|  |_ ...
|
|_src/
|  |_ lib.rs
|  |_ bin/
|  |  |_ framework_that_broke_my_heart.rs
|  |_ something/
|     |_ mod.rs
|     |_ a.rs
|     |_ b.rs
|
|  // ^ that's the old part
|
|_db_stuff/ // this whole folder is new
|  |_Cargo.toml
|  |_src/
|     |_ lib.rs // <- that's all the --lib does
|_ftbmh/ // this whole folder is new
   |_Cargo.toml
   |_src/
      |_ main.rs
Now, from the original Cargo.toml, we move the parts as necessary to the new Cargo.toml files, for example the dependencies to whichever crate needs them.
When you are done with that, clean out your good old Cargo.toml and put just this in it:
[workspace]
members = ["db_stuff", "ftbmh"]
That's it, now the original Cargo.toml doesn't have a [package] or [dependencies] section; the original crate is now a shell.
When you made the 2 crates inside (db_stuff and ftbmh), cargo saw that you were inside an existing crate with its own git repo, so it didn't make a new one for them; your old repo is still good and healthy.
Now remember that you split your code and probably one part depends on the other. In this case ftbmh depends on db_stuff, so we have to add that dependency to the ftbmh Cargo.toml file:
# ftbmh/Cargo.toml
[package]
# your stuff
[dependencies]
# your dependencies
db_stuff = { path = "../db_stuff" } # as you may know, the `..`
# in the path refers to the parent folder of the current one
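The last thing to do is move the old code into the new crates: lib.rs and something/ go into db_stuff/src/, and the file from bin/ becomes ftbmh/src/main.rs. As you may remember from the last step, that executable was already outside the original crate and already used the library as an external crate, so almost everything stays the same. The only difference is that the library is now named db_stuff, so that is the crate name to use in the use paths.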
Conclusion
We are done. What started as a simple single-file project is now a workspace with 2 crates. Of course those could be 3 or 100, just rinse and repeat. The same goes for modules, which are the namespaces of Rust.
As you can see, if your code is properly decoupled, the whole process is very unobtrusive. Of course it is good to have the design planned from the start, but sometimes projects grow more than we expected and get more complex, and it is good to have ways to adapt them easily without making a mess. It is also good because you don't need to over-engineer your solution out of fear of a future operation like the one shown here. You can grow your code organically.
I hope this helps someone. There is nothing new here, but I found that it was too scattered across the documentation and books. There is a lot more to it: I didn't talk much about public/private and what is visible to whom by default. That is very well explained in both The Book and the amazing O'Reilly "Programming Rust". I just tried to build a scaffold that makes it easy to hang the details on later.
Feel free to send me any corrections and suggestions, especially if something is weirdly written. English is not my native language, and this is my best for now.