Yassin Eldeeb 🦀

Posted on Feb 21, 2022

Mutex in Rust can be tricky sometimes...

#rust #tricky #mutex #multithreading

Can you guess why snippet 1 outperforms snippet 2 by 400%? 🤯

This is a thread pool implementation for a simple HTTP server. It uses an mpsc channel to add incoming request handlers to a queue; the worker threads then receive the sent closures (the request handlers) and execute them in parallel.

Snippet 1:

use std::sync::mpsc::Receiver;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

struct Worker {
    id: usize,
    thread: JoinHandle<()>,
}

impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<Receiver<Job>>>) -> Worker {
        let thread = thread::spawn(move || loop {
            // here 👇
            let job = receiver.lock().unwrap().recv().unwrap();
            // here 👆
            println!("executing on thread: {}", id);

            job();
        });

        Worker { id, thread }
    }
}

Snippet 2:

use std::sync::mpsc::Receiver;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

struct Worker {
    id: usize,
    thread: JoinHandle<()>,
}

impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<Receiver<Job>>>) -> Worker {
        let thread = thread::spawn(move || loop {
            // here 👇
            let lock = receiver.lock().unwrap();
            let job = lock.recv().unwrap();
            // here 👆
            println!("executing on thread: {}", id);

            job();
        });

        Worker { id, thread }
    }
}

Benchmarks

Tested using Artillery on a CPU with 16 logical processors. I checked the graphs to confirm that all threads are indeed running in parallel and not just concurrently.

Look at the spikes 👇

[Graphs for snippet 1 and snippet 2; images not included]

Really? A single line can make that much of a difference? Aren't they the same, though? 🤔

Let's discuss a few things first to understand where the problem is 👨‍🏫

mpsc, as its name says, stands for "multiple producer, single consumer". In this case we want the opposite: a single producer (the main thread) that adds request handlers to the queue, and multiple consumers, so that multiple threads can consume messages in parallel.

To achieve that, we wrapped the receiver we got from mpsc in an Arc, which gives us shared ownership of the receiver so we can pass it to the different threads. To make sure only one thread is responsible for executing a given handler, we used a Mutex, which allows only one thread to access the receiver at any given time, because we don't want two threads to respond to the same request.
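To make the "multiple producer, single consumer" part concrete, here is a minimal sketch using only the standard library. The helper name `collect_from_producers` is made up for this illustration; the point is just that a `Sender` can be cloned for every producer, while there is only one `Receiver`.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns `producers` threads that each send their own id, then collects
/// everything on the single receiver. (Hypothetical helper for illustration.)
fn collect_from_producers(producers: u32) -> Vec<u32> {
    let (sender, receiver) = mpsc::channel();

    let handles: Vec<_> = (0..producers)
        .map(|id| {
            // Sender is Clone: every producer thread gets its own handle.
            let sender = sender.clone();
            thread::spawn(move || sender.send(id).unwrap())
        })
        .collect();

    // Drop the original sender so the channel closes once all clones are gone.
    drop(sender);
    for handle in handles {
        handle.join().unwrap();
    }

    // Receiver is not Clone: there is exactly one consuming end.
    let mut received: Vec<u32> = receiver.iter().collect();
    received.sort();
    received
}

fn main() {
    println!("received: {:?}", collect_from_producers(3));
}
```

Since the receiver can't be cloned, sharing it between worker threads needs some extra wrapping, which is where Arc and Mutex come in.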

To access the data in a mutex, a thread must first signal that it wants access by asking to acquire the mutex’s lock. This lock keeps track of who currently has exclusive access to the data and the lock is released when the MutexGuard is dropped.
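A small sketch of that last point, assuming nothing beyond `std::sync::Mutex`: while a MutexGuard is alive, `try_lock` fails, and once the guard is dropped it succeeds again. The function name is hypothetical.

```rust
use std::sync::Mutex;

/// Returns (busy_while_guard_alive, busy_after_guard_dropped).
/// Hypothetical helper, only here to illustrate when the lock is released.
fn lock_state_around_guard() -> (bool, bool) {
    let counter = Mutex::new(0);

    let guard = counter.lock().unwrap();
    // The guard is still alive, so a second attempt to lock fails.
    let busy_while_held = counter.try_lock().is_err();

    // Dropping the MutexGuard is what releases the lock.
    drop(guard);
    let busy_after_drop = counter.try_lock().is_err();

    (busy_while_held, busy_after_drop)
}

fn main() {
    let (held, after) = lock_state_around_guard();
    println!("busy while guard alive: {}", held);
    println!("busy after guard dropped: {}", after);
}
```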

Notice in snippet 1:

// -------⬇⬇⬇⬇Temporary Value⬇⬇⬇⬇
let job = receiver.lock().unwrap().recv().unwrap();

✨ The trick here is that with let, any temporary values created in the expression on the right-hand side of the equals sign are dropped as soon as the let statement ends! Here the MutexGuard returned by lock() is exactly such a temporary.

So that means that the lock is released before the thread even starts executing the handler, which means that other threads can acquire the lock to receive incoming requests' handlers to process them 💪
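To see the drop timing without any threads involved, here is a sketch with a stand-in type. `FakeGuard`, `fake_lock`, and the event log are made up for this example; they only mimic a MutexGuard by recording the moment they are dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for a MutexGuard: records the moment it is dropped.
struct FakeGuard(Log);

impl FakeGuard {
    fn recv(&self) -> u32 {
        42
    }
}

impl Drop for FakeGuard {
    fn drop(&mut self) {
        self.0.borrow_mut().push("guard dropped");
    }
}

fn fake_lock(log: &Log) -> FakeGuard {
    FakeGuard(Rc::clone(log))
}

/// Snippet 1 style: the guard is a temporary inside the let statement.
fn temporary_guard() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _job = fake_lock(&log).recv(); // guard dropped at the end of this statement
        log.borrow_mut().push("job runs");
    }
    let events = log.borrow().clone();
    events
}

/// Snippet 2 style: the guard is bound to a name and lives until the scope ends.
fn named_guard() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let guard = fake_lock(&log);
        let _job = guard.recv();
        log.borrow_mut().push("job runs");
    } // guard dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("temporary: {:?}", temporary_guard());
    println!("named:     {:?}", named_guard());
}
```

With the temporary, "guard dropped" is recorded before "job runs"; with the named binding the order is reversed.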

As you can see here, we're running 16 threads in parallel, which is freaking awesome!

executing on thread: 0
executing on thread: 1
executing on thread: 2
executing on thread: 5
executing on thread: 4
executing on thread: 3
executing on thread: 10
executing on thread: 7
executing on thread: 13
executing on thread: 9
executing on thread: 6
executing on thread: 11
executing on thread: 12

On the other hand with snippet 2:

    let thread = thread::spawn(move || loop {
        let acquired_lock = receiver.lock().unwrap();
        let job = acquired_lock.recv().unwrap();

        println!("executing on thread: {}", id);

        job();
        // `acquired_lock` is dropped here, at the end of the loop body,
        // so the lock is only released after the job has finished.
    });

The lock is released only after the handler has finished executing, specifically at the end of the scope where the guard was defined.

Therefore the threads in this case are running concurrently and not in parallel, because each thread has to wait for the thread that acquired the lock to release it before it can read the next closure from the receiver.

And actually, the thread that just held the lock will probably pick it up again faster than the other threads (the standard library's Mutex doesn't promise to hand the lock to whoever has waited longest), because it's like:

acquire lock.
release lock.
next iteration of the loop.
acquire lock.
...

which finally leads to effectively single-threaded execution, even though there are threads spawned and waiting in the pool, as shown below 👇

executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
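If you'd rather measure this than read logs, here is a rough sketch (not the benchmark from this post). It runs 8 sleeping jobs on 4 workers and records the highest number of jobs that were running at the same moment. The names `worker_loop`, `max_parallel_jobs`, and the `hold_lock_during_job` flag are made up for the example, and `recv` errors end the loop instead of being unwrapped so the threads can be joined.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, Receiver};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

fn worker_loop(receiver: Arc<Mutex<Receiver<Job>>>, hold_lock_during_job: bool) {
    loop {
        if hold_lock_during_job {
            // Snippet 2 style: the named guard lives until the end of this block.
            let lock = receiver.lock().unwrap();
            match lock.recv() {
                Ok(job) => job(),
                Err(_) => break, // channel closed, no more jobs
            }
        } else {
            // Snippet 1 style: the guard is a temporary, dropped after the let.
            let message = receiver.lock().unwrap().recv();
            match message {
                Ok(job) => job(),
                Err(_) => break,
            }
        }
    }
}

/// Runs 8 sleeping jobs on 4 workers and returns the peak number of jobs
/// that were executing at the same time.
fn max_parallel_jobs(hold_lock_during_job: bool) -> usize {
    let (sender, receiver) = mpsc::channel::<Job>();
    let receiver = Arc::new(Mutex::new(receiver));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let receiver = Arc::clone(&receiver);
            thread::spawn(move || worker_loop(receiver, hold_lock_during_job))
        })
        .collect();

    for _ in 0..8 {
        let running = Arc::clone(&running);
        let peak = Arc::clone(&peak);
        let job: Job = Box::new(move || {
            let now = running.fetch_add(1, Ordering::SeqCst) + 1;
            peak.fetch_max(now, Ordering::SeqCst);
            thread::sleep(Duration::from_millis(50));
            running.fetch_sub(1, Ordering::SeqCst);
        });
        sender.send(job).unwrap();
    }

    drop(sender); // closing the channel lets the workers exit
    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}

fn main() {
    println!("guard held during job, peak: {}", max_parallel_jobs(true));
    println!("guard as temporary, peak:    {}", max_parallel_jobs(false));
}
```

With the guard held during the job the peak can never exceed 1, because a job only runs while its thread owns the lock. With the temporary guard the peak depends on scheduling, but it should normally be greater than 1.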

Very tricky right? 😅
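If you do want to keep the guard in a named variable like in snippet 2, one option is to release it explicitly with `drop` before running the job. This is only a sketch of that idea: `worker_loop` and `run_jobs` are hypothetical names, and unlike the snippets above the loop exits when the channel closes instead of unwrapping.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, Receiver};
use std::sync::{Arc, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Keeps the named guard, but releases it before the job runs.
fn worker_loop(receiver: Arc<Mutex<Receiver<Job>>>) {
    loop {
        let lock = receiver.lock().unwrap();
        let message = lock.recv();
        drop(lock); // release the mutex here, not at the end of the loop body

        match message {
            Ok(job) => job(),
            Err(_) => break, // channel closed, no more jobs
        }
    }
}

/// Sends `jobs` counting jobs to `workers` threads and returns how many ran.
fn run_jobs(workers: usize, jobs: usize) -> usize {
    let (sender, receiver) = mpsc::channel::<Job>();
    let receiver = Arc::new(Mutex::new(receiver));
    let done = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let receiver = Arc::clone(&receiver);
            thread::spawn(move || worker_loop(receiver))
        })
        .collect();

    for _ in 0..jobs {
        let done = Arc::clone(&done);
        let job: Job = Box::new(move || {
            done.fetch_add(1, Ordering::SeqCst);
        });
        sender.send(job).unwrap();
    }

    drop(sender); // closing the channel lets the workers exit
    for handle in handles {
        handle.join().unwrap();
    }
    done.load(Ordering::SeqCst)
}

fn main() {
    println!("jobs executed: {}", run_jobs(4, 20));
}
```

Wrapping the two lock lines in their own `{ ... }` block would have the same effect, since the guard is then dropped at the end of that inner scope.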

The lesson here is to be careful when dealing with Mutex, because the compiler can't guarantee the threading behavior you want to achieve, whether in this scenario or in others like deadlocks, which are so much worse.

That's why I've included the picture of Ferris from the Rust book that means "This code compiles but does not produce the desired behavior."

Without the logs, I wouldn't even have noticed that there was a problem 🤷‍♂️

Hope you enjoyed it 🤗

DEV Community
