Yassin Eldeeb 🦀
Mutex in Rust can be tricky sometimes...

Can you guess why snippet 1 outperforms snippet 2 by 400%? 🤯

So, this is a thread pool implementation for a simple HTTP server: an mpsc channel queues the incoming requests' handlers, and a pool of threads receives the sent closures (request handlers) and executes them in parallel.

Snippet 1:

use std::sync::mpsc::Receiver;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

struct Worker {
    id: usize,
    thread: JoinHandle<()>,
}

impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<Receiver<Job>>>) -> Worker {
        let thread = thread::spawn(move || loop {
            // here 👇
            let job = receiver.lock().unwrap().recv().unwrap();
            // here 👆
            println!("executing on thread: {}", id);

            job();
        });

        Worker { id, thread }
    }
}

Snippet 2:

use std::sync::mpsc::Receiver;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

struct Worker {
    id: usize,
    thread: JoinHandle<()>,
}

impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<Receiver<Job>>>) -> Worker {
        let thread = thread::spawn(move || loop {
            // here 👇
            let lock = receiver.lock().unwrap();
            let job = lock.recv().unwrap();
            // here 👆
            println!("executing on thread: {}", id);

            job();
        });

        Worker { id, thread }
    }
}

Benchmarks

I benchmarked both snippets with Artillery on a CPU with 16 logical processors and checked the CPU graphs to confirm whether the threads are actually running in parallel or merely concurrently.

Look at the CPU spikes 👇

Snippet 1: (CPU usage graph)

Snippet 2: (CPU usage graph)

Really? A single line can make that much difference? Aren't they the same, though? 🤔

Let's discuss a few things first to understand where the problem is 👨‍🏫

mpsc stands for "multiple producers, single consumer". In this case we want a single producer (the main thread) to add the requests' handlers to the queue, and we want multiple consumers so that multiple threads can consume messages in parallel.

To achieve that, we wrap the receiver we got from mpsc in an Arc, which gives us shared ownership of the receiver so we can pass it to the different threads. And to ensure only one thread picks up a given handler, we use a Mutex, which allows only one thread to access the receiver at any given time; we don't want two threads responding to the same request.

To access the data in a mutex, a thread must first signal that it wants access by asking to acquire the mutexโ€™s lock. This lock keeps track of who currently has exclusive access to the data and the lock is released when the MutexGuard is dropped.

Notice in snippet 1:

// --------⬇⬇⬇⬇ Temporary Value ⬇⬇⬇⬇
let job = receiver.lock().unwrap().recv().unwrap();

✨ The trick here is: with let, any temporary values created in the expression on the right-hand side of the equals sign are dropped as soon as the let statement ends!

So the lock is released before the thread even starts executing the handler, which means other threads can acquire the lock and receive incoming requests' handlers to process 💪

As you can see here, the handlers are spread across the spawned threads running in parallel, which is freaking awesome!

executing on thread: 0
executing on thread: 1
executing on thread: 2
executing on thread: 5
executing on thread: 4
executing on thread: 3
executing on thread: 10
executing on thread: 7
executing on thread: 13
executing on thread: 9
executing on thread: 6
executing on thread: 11
executing on thread: 12

On the other hand with snippet 2:

    let thread = thread::spawn(move || loop {
        let acquired_lock = receiver.lock().unwrap();
        let job = acquired_lock.recv().unwrap();

        println!("executing on thread: {}", id);

        job();
        // `acquired_lock` is dropped here, at the end of each loop
        // iteration, and only then is the lock released.
    });

The lock is released only after the handler has finished executing, specifically at the end of the scope where the guard was defined.

Therefore the threads in this case run concurrently, not in parallel, because each thread has to wait for whichever thread holds the lock to release it before it can read the next closure from the receiver.

And in practice, the current lock holder will probably reacquire the lock faster than the other waiting threads, since its loop is just:

  1. acquire lock.
  2. release lock.
  3. next iteration of the loop.
  4. acquire lock.
  5. ...

which effectively leads to single-threaded execution, even though there are threads spawned and waiting in the pool, as shown below 👇

executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0
executing on thread: 0

Very tricky, right? 😅

The lesson here is to be careful when dealing with Mutex, because the compiler can't guarantee the threading behavior you want to achieve; scenarios like this one, or deadlocks, which are much worse, compile just fine.

That's why I've included the picture of Ferris from the Rust book that means "This code compiles but does not produce the desired behavior."

Without the logs, I wouldn't even have noticed there was a problem 🤷‍♂️

Hope you enjoyed it 🤗
