Hi folks, I've been really struggling to keep up the blog recently. This was in part due to a change of scenery in Berlin. But I've also learned that writing continuously on the same subject for so long does get kind of boring. That said, I'm gonna push through today and summarize the key points about threading in Rust.
Yesterday's questions answered
No questions to answer
Today's open questions
No open questions
Threading 101
A multi-threaded program can run code concurrently or in parallel. This can increase the performance of your program at the cost of increasing its complexity. In older programming languages, threading also made debugging code more difficult: threading bugs are hard to reproduce (as thread scheduling is usually managed by the OS) and working out what a specific thread is doing can be impossible without an advanced debugger.
Rust aims to solve some of these complexities via compiler rules that make thread-unsafe code fail at compile time. This is achieved through the use of ownership guards and atomic reference counters, which we'll cover later.
The API
We can create a thread really easily in Rust:
use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });
}
By default, the main thread will not wait for spawned threads to finish: when main returns, the program exits and any threads still running are stopped. This is why we need to call join on the JoinHandle returned by thread::spawn to force the main thread to wait. As join returns a Result, we also unwrap it. Note that where you call join in your code will impact the runtime behaviour:
use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });
    handle.join().unwrap();
}
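To make the point about join placement concrete, here is a minimal sketch. The function name join_before_main_work is made up for illustration. It joins before the main thread does its own work, so the two pieces of work run one after the other instead of interleaving. It also shows that join hands back whatever the closure returned.

```rust
use std::thread;
use std::time::Duration;

// Hypothetical helper: returns values in the order they were produced.
fn join_before_main_work() -> Vec<u32> {
    let handle = thread::spawn(|| {
        let mut produced = Vec::new();
        for i in 1..5 {
            produced.push(i * 10);
            thread::sleep(Duration::from_millis(1));
        }
        produced // handed back to whoever joins the thread
    });

    // Joining here blocks until the spawned thread has finished...
    let mut order: Vec<u32> = handle.join().unwrap();

    // ...so the main thread's own work only starts afterwards.
    for i in 1..3 {
        order.push(i);
    }
    order
}

fn main() {
    println!("{:?}", join_before_main_work());
}
```

If the join call were moved below the main thread's loop, both threads would be working at the same time and the order would depend on the scheduler.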
What do threads own?
In the above example, we can see that thread::spawn takes a closure as its argument. If we think back, we'll remember that closures are special in Rust: they capture variables from the scope they are declared in. By default, capturing means borrowing in ownership terms:
use std::thread;

fn main() {
    let my_vec = vec![1, 2, 3];
    // Does not compile (error E0373): the closure borrows my_vec,
    // but the thread may outlive the current function.
    let handle = thread::spawn(|| {
        for _ in 1..10 {
            println!("my vec: {:?}", my_vec);
        }
    });
    handle.join().unwrap();
}
The compiler will reject this code because it can't know how long the thread will live. The thread could outlive my_vec, leaving it with an invalid reference.
We can move the vector into the thread by putting the move keyword in front of the closure. However, we won't be able to use the value anywhere after spawning the thread, as ownership now belongs to the closure.
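A minimal sketch of the move approach, assuming we only need the vector inside the thread. The helper name sum_in_thread is invented for this example.

```rust
use std::thread;

// Hypothetical helper: sums a vector on a spawned thread.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    // `move` transfers ownership of `values` into the closure,
    // so the thread no longer depends on this function's scope.
    let handle = thread::spawn(move || values.iter().sum::<i32>());

    // Using `values` here would be a compile error: it has been moved.
    handle.join().unwrap()
}

fn main() {
    let my_vec = vec![1, 2, 3];
    println!("sum: {}", sum_in_thread(my_vec));
}
```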
Chatty threads
Rust implements the concept of channels as a way to pass messages between threads. This is considered more robust than manipulating shared state. A channel has one or more producers (transmitters) and a single receiver, hence the name mpsc (multiple producer, single consumer):
use std::sync::mpsc;
let (tx, rx) = mpsc::channel();
When sending messages between threads it's important to remember two things:
- Sending a value also moves ownership of it to the receiving thread
- You can force a thread to wait for a message with recv. You can use try_recv to check whether a message is available without blocking.
The first point is a great source of robustness for your Rust programs. Channels are a really safe way to pass around values, as the compiler won't let you accidentally use data after you've sent it.
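Here is a small sketch that puts the two points together. The function name collect_greetings and the messages are made up for illustration. The spawned thread sends owned String values, and the main thread uses try_recv, recv and iteration over the receiver to pick them up.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: gathers every message sent by one spawned thread.
fn collect_greetings() -> Vec<String> {
    let (tx, rx) = mpsc::channel();

    // Nothing has been sent yet, so the non-blocking try_recv
    // returns an error straight away instead of waiting.
    assert!(rx.try_recv().is_err());

    let handle = thread::spawn(move || {
        for word in vec!["hi", "from", "the", "thread"] {
            let msg = String::from(word);
            tx.send(msg).unwrap();
            // `msg` was moved into the channel; using it here would not compile.
        }
        // `tx` is dropped when the closure ends, which closes the channel.
    });

    // recv blocks until the first message arrives.
    let mut received = vec![rx.recv().unwrap()];

    // Iterating the receiver keeps blocking until every sender is dropped.
    for msg in rx {
        received.push(msg);
    }

    handle.join().unwrap();
    received
}

fn main() {
    println!("{:?}", collect_greetings());
}
```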
Sharing state
So far our examples have only covered situations where we create one thread. If we want to create multiple threads, we also need more advanced tools to manage ownership of shared resources among them. To be able to share ownership of data across multiple threads, we need to use a combination of the Arc and Mutex types.
Mutually exclusive
The Mutex type is a way to guarantee exclusive access to a resource via a lock. Even when you aren't working with threads, a Mutex can be used to manipulate the value it wraps from a scope that doesn't own it: calling lock on a shared reference to the Mutex gives you mutable access to the value until the lock is released.
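As a single-threaded sketch of that idea (the add_one function is hypothetical), note that the function only receives a shared reference, yet it can still change the wrapped value by going through the lock.

```rust
use std::sync::Mutex;

// Hypothetical helper: takes a shared reference, not ownership and not &mut.
fn add_one(counter: &Mutex<i32>) {
    // lock returns a guard that gives mutable access to the inner value.
    let mut guard = counter.lock().unwrap();
    *guard += 1;
} // the guard is dropped here, which releases the lock

fn main() {
    let counter = Mutex::new(5);
    add_one(&counter);
    println!("counter = {}", *counter.lock().unwrap());
}
```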
If we use a Mutex with threading, things get a bit more complicated. As we have to move values into a thread, the Mutex would end up needing multiple owners. In a previous blog we looked at the Rc (reference counter) type. This isn't thread safe. Rc is cheaper than the thread-safe Arc (atomic reference counter) type, which we'll use here.
By wrapping our Mutex in an Arc, we instruct Rust to track all the owners of the Mutex. We can call Arc::clone with a reference to our Arc to create another pointer to the same Mutex, one for each thread. Deref coercion then helps us write clean code: we can call lock directly on the Arc and manipulate the underlying value through the guard it returns.
Send and Sync
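If you want data to be usable across threads, its type has to implement the Send and Sync marker traits: Send means ownership of a value can be transferred to another thread, and Sync means references to it can be shared between threads. This is where the overhead I mentioned before comes from: Arc pays for atomic operations so that it can be Send and Sync, which Rc is not. Types made up entirely of Send and Sync parts get these traits automatically. Implementing Send and Sync by hand for your own types requires unsafe Rust. This is a topic I'll touch on in a later blog.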
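One way to see these traits in action is to ask the compiler directly. This is only a sketch: is_send and is_sync are helper functions invented for this example, not part of the standard library. They compile only when the type parameter satisfies the trait bound.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical helpers: they only compile for types that meet the bound.
fn is_send<T: Send>() -> bool {
    true
}

fn is_sync<T: Sync>() -> bool {
    true
}

fn main() {
    // Arc<Mutex<i32>> can be moved to, and shared between, threads.
    println!("Send: {}", is_send::<Arc<Mutex<i32>>>());
    println!("Sync: {}", is_sync::<Arc<Mutex<i32>>>());

    // Uncommenting the next line should fail to compile,
    // because Rc is not Send:
    // is_send::<std::rc::Rc<i32>>();
}
```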