Rust has revolutionized concurrent programming with its unique approach to shared mutability. When I first encountered Rust's ownership system, I was skeptical about whether it could deliver on its promise of memory safety without sacrificing performance. After building several concurrent systems, I'm convinced it offers the best balance available today.
The challenge of concurrent programming has always centered on safely sharing mutable state between threads. Traditional approaches either restrict mutability (functional programming) or control access through locks (imperative programming). Rust takes a hybrid approach that gives developers fine-grained control over how data is shared.
The Foundation: Memory Safety in Concurrent Contexts
Rust's type system prevents data races at compile time through its ownership rules. A data race occurs when two threads access the same memory location simultaneously, with at least one performing a write operation, without proper synchronization.
The compiler enforces rules that make concurrent bugs rare:
- Each value has a single owner
- References to that value must follow borrowing rules
- References are either exclusive (&mut) or shared (&), but never both simultaneously
This system catches a whole class of bugs during compilation that would cause runtime errors in other languages.
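As a minimal sketch (not from the original article) of these rules in action: the compiler rejects a plain borrow that could race across threads, and forces you to transfer ownership instead (or reach for the synchronized types covered below).
```rust
use std::thread;

fn main() {
    let mut data = vec![1, 2, 3];

    // Rejected at compile time: thread::spawn requires 'static, so the
    // new thread cannot borrow `data` from main's stack frame.
    // thread::spawn(|| data.push(4));

    // Accepted: `move` transfers ownership into the thread, so no two
    // threads can ever touch `data` at the same time.
    let handle = thread::spawn(move || {
        data.push(4);
        println!("{:?}", data);
    });
    handle.join().unwrap();
}
```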
Atomics: Building Blocks for Lock-Free Programming
For high-performance concurrent code, Rust provides atomic types in the standard library. These types implement thread-safe operations that don't require locks.
```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..10)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // fetch_add is an atomic read-modify-write: no lock needed.
                    counter.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final count: {}", counter.load(Ordering::SeqCst));
}
```
Each atomic operation takes an Ordering parameter that specifies its memory-ordering constraints: how the operation may be reordered relative to other reads and writes, as observed by other threads. This gives developers precise control over the trade-off between synchronization strength and performance.
Rust provides several atomic types:
- AtomicBool
- AtomicI8, AtomicI16, AtomicI32, AtomicI64, AtomicIsize
- AtomicU8, AtomicU16, AtomicU32, AtomicU64, AtomicUsize
- AtomicPtr<T>
While powerful, atomic operations require careful thought about memory ordering to ensure correctness.
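To make that trade-off concrete, here is a minimal sketch (not from the original article) of the classic release/acquire handoff: the data itself can be stored with Relaxed ordering because the Release store on the flag publishes it, and the consumer's Acquire load guarantees it is seen.
```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

static DATA: AtomicUsize = AtomicUsize::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let producer = thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);
        // Release publishes the DATA store to any thread that
        // observes READY == true with an Acquire load.
        READY.store(true, Ordering::Release);
    });

    let consumer = thread::spawn(|| {
        while !READY.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        // The Acquire load synchronizes with the Release store,
        // so reading 42 here is guaranteed.
        assert_eq!(DATA.load(Ordering::Relaxed), 42);
    });

    producer.join().unwrap();
    consumer.join().unwrap();
}
```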
Interior Mutability: The Secret to Shared Mutable State
Rust's interior mutability pattern allows mutable access to data even through shared references. This is the foundation for building thread-safe data structures.
The most common types implementing interior mutability are:
- Mutex<T> - provides exclusive access to data
- RwLock<T> - allows multiple readers or a single writer
- RefCell<T> - for single-threaded scenarios
- Cell<T> - for Copy types in single-threaded contexts
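The threaded examples below focus on Mutex and RwLock; for completeness, here is a quick single-threaded sketch (my own, not from the article) of Cell and RefCell, the non-thread-safe members of the family.
```rust
use std::cell::{Cell, RefCell};

fn main() {
    // Cell: copy values in and out through a shared reference.
    let hits = Cell::new(0u32);
    hits.set(hits.get() + 1);

    // RefCell: the borrow rules are checked at runtime instead of
    // compile time.
    let log = RefCell::new(Vec::new());
    log.borrow_mut().push("first entry");

    // A second simultaneous mutable borrow would panic at runtime:
    // let a = log.borrow_mut();
    // let b = log.borrow_mut(); // panics: already borrowed

    println!("hits: {}, log: {:?}", hits.get(), log.borrow());
}
```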
Here's how to use a mutex to protect shared data:
```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    let mut handles = vec![];

    for i in 0..3 {
        let data_clone = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            let mut data = data_clone.lock().unwrap();
            data.push(i + 4);
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final data: {:?}", *data.lock().unwrap());
}
```
The Arc (atomically reference-counted) smart pointer lets multiple threads share ownership of the same data, while the Mutex ensures only one thread can access it at a time.
Building a Thread-Safe Counter
Let's implement a simple counter that can be safely incremented from multiple threads:
```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct ThreadSafeCounter {
    count: Mutex<u64>,
}

impl ThreadSafeCounter {
    fn new() -> Self {
        ThreadSafeCounter {
            count: Mutex::new(0),
        }
    }

    fn increment(&self) {
        let mut count = self.count.lock().unwrap();
        *count += 1;
    }

    fn value(&self) -> u64 {
        *self.count.lock().unwrap()
    }
}

fn main() {
    let counter = Arc::new(ThreadSafeCounter::new());
    let mut handles = vec![];

    for _ in 0..10 {
        let counter_clone = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                counter_clone.increment();
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Counter value: {}", counter.value());
}
```
This pattern works well for simple data structures. The Mutex ensures that only one thread can modify the counter at a time, preventing race conditions.
Advanced Synchronization with RwLock
When your workload has many more reads than writes, RwLock can improve performance by allowing multiple readers simultaneously:
```rust
use std::sync::{Arc, RwLock};
use std::thread;

struct Database {
    data: RwLock<Vec<String>>,
}

impl Database {
    fn new() -> Self {
        Database {
            data: RwLock::new(Vec::new()),
        }
    }

    fn add_entry(&self, entry: String) {
        let mut data = self.data.write().unwrap();
        data.push(entry);
    }

    fn entries(&self) -> Vec<String> {
        let data = self.data.read().unwrap();
        data.clone()
    }
}

fn main() {
    let db = Arc::new(Database::new());
    let mut handles = vec![];

    // Writer threads
    for i in 0..5 {
        let db_clone = Arc::clone(&db);
        handles.push(thread::spawn(move || {
            db_clone.add_entry(format!("Entry {}", i));
        }));
    }

    // Reader threads
    for _ in 0..20 {
        let db_clone = Arc::clone(&db);
        handles.push(thread::spawn(move || {
            let entries = db_clone.entries();
            println!("Read {} entries", entries.len());
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
}
```
The RwLock allows multiple reader threads to access the data simultaneously, which can significantly improve performance in read-heavy workloads.
Building a Concurrent HashMap
Now let's build something more complex - a thread-safe hash map implementation:
```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, RwLock};
use std::thread;

struct ConcurrentHashMap<K, V> {
    shards: Vec<RwLock<HashMap<K, V>>>,
}

impl<K, V> ConcurrentHashMap<K, V>
where
    K: Eq + Hash + Clone,
    V: Clone,
{
    fn new(shard_count: usize) -> Self {
        let mut shards = Vec::with_capacity(shard_count);
        for _ in 0..shard_count {
            shards.push(RwLock::new(HashMap::new()));
        }
        ConcurrentHashMap { shards }
    }

    fn get_shard(&self, key: &K) -> &RwLock<HashMap<K, V>> {
        // Hash the key to pick a shard; keys that land in different
        // shards never contend with each other.
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let shard_index = (hasher.finish() as usize) % self.shards.len();
        &self.shards[shard_index]
    }

    fn insert(&self, key: K, value: V) {
        let shard = self.get_shard(&key);
        let mut map = shard.write().unwrap();
        map.insert(key, value);
    }

    fn get(&self, key: &K) -> Option<V> {
        let shard = self.get_shard(key);
        let map = shard.read().unwrap();
        map.get(key).cloned()
    }
}

fn main() {
    let map = Arc::new(ConcurrentHashMap::<String, u64>::new(16));
    let mut handles = vec![];

    // Writer threads
    for i in 0..100 {
        let map_clone = Arc::clone(&map);
        handles.push(thread::spawn(move || {
            map_clone.insert(format!("key{}", i), i as u64);
        }));
    }

    // Reader threads (a key may not be written yet, hence the if let)
    for i in 0..100 {
        let map_clone = Arc::clone(&map);
        handles.push(thread::spawn(move || {
            if let Some(value) = map_clone.get(&format!("key{}", i)) {
                println!("Read key{}: {}", i, value);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
}
```
This implementation uses a technique called "sharding" to reduce contention. By dividing the map into multiple independent segments, each protected by its own RwLock, we allow operations on different shards to proceed in parallel.
Lock-Free Data Structures with Crossbeam
For maximum performance, lock-free data structures avoid locks entirely. The crossbeam crate provides several high-quality implementations:
```rust
use crossbeam::queue::ArrayQueue;
use std::sync::Arc;
use std::thread;

fn main() {
    let queue = Arc::new(ArrayQueue::new(100));
    let mut handles = vec![];

    // Producer threads
    for i in 0..10 {
        let queue = Arc::clone(&queue);
        handles.push(thread::spawn(move || {
            for j in 0..10 {
                let value = i * 100 + j;
                // push fails (returning the value) if the bounded queue is full.
                if queue.push(value).is_ok() {
                    println!("Pushed {}", value);
                } else {
                    println!("Queue full, couldn't push {}", value);
                }
            }
        }));
    }

    // Consumer threads. Note: pop returns None the moment the queue is
    // empty, so a consumer that starts before the producers may exit
    // early; a real pipeline would add a "done" flag or a blocking channel.
    for _ in 0..5 {
        let queue = Arc::clone(&queue);
        handles.push(thread::spawn(move || {
            while let Some(value) = queue.pop() {
                println!("Popped {}", value);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
}
```
The ArrayQueue is a bounded lock-free queue. It uses atomic operations to ensure thread safety without locks, which can significantly improve performance in high-contention scenarios.
Implementing a Lock-Free Stack
Let's create our own lock-free data structure - a simple stack using atomic operations:
```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::Arc;
use std::thread;

struct Node<T> {
    data: T,
    next: *mut Node<T>,
}

struct LockFreeStack<T> {
    head: AtomicPtr<Node<T>>,
}

impl<T> LockFreeStack<T> {
    fn new() -> Self {
        LockFreeStack {
            head: AtomicPtr::new(ptr::null_mut()),
        }
    }

    fn push(&self, data: T) {
        let new_node = Box::into_raw(Box::new(Node {
            data,
            next: ptr::null_mut(),
        }));
        loop {
            let current_head = self.head.load(Ordering::Relaxed);
            unsafe {
                (*new_node).next = current_head;
            }
            // Publish the node only if the head hasn't changed under us;
            // otherwise loop and retry against the new head.
            if self
                .head
                .compare_exchange(current_head, new_node, Ordering::Release, Ordering::Relaxed)
                .is_ok()
            {
                break;
            }
        }
    }

    fn pop(&self) -> Option<T> {
        loop {
            let current_head = self.head.load(Ordering::Acquire);
            if current_head.is_null() {
                return None;
            }
            let next = unsafe { (*current_head).next };
            // Detach the node only if no other thread popped it first.
            if self
                .head
                .compare_exchange(current_head, next, Ordering::Release, Ordering::Relaxed)
                .is_ok()
            {
                // Reclaim the box and move the data out.
                let node = unsafe { Box::from_raw(current_head) };
                return Some(node.data);
            }
        }
    }
}

// The stack moves values between threads, so T must be Send in both cases.
unsafe impl<T: Send> Send for LockFreeStack<T> {}
unsafe impl<T: Send> Sync for LockFreeStack<T> {}

fn main() {
    let stack = Arc::new(LockFreeStack::<i32>::new());
    let mut handles = vec![];

    // Push threads
    for i in 0..5 {
        let stack = Arc::clone(&stack);
        handles.push(thread::spawn(move || {
            for j in 0..10 {
                stack.push(i * 100 + j);
                println!("Pushed {}", i * 100 + j);
            }
        }));
    }

    // Pop threads (these stop as soon as they see an empty stack)
    for _ in 0..2 {
        let stack = Arc::clone(&stack);
        handles.push(thread::spawn(move || {
            while let Some(value) = stack.pop() {
                println!("Popped {}", value);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
}
```
This implementation uses atomic operations (AtomicPtr and compare_exchange) to update the stack safely without locks. The compare-and-swap pattern is fundamental to lock-free programming.
Note that this simplified implementation is not safe for production use: pop frees a node while another thread may still be dereferencing it through a stale head pointer, and reusing freed addresses exposes the classic ABA problem. Production lock-free code relies on memory reclamation schemes such as hazard pointers or epoch-based reclamation (the approach crossbeam uses internally).
Lazy Initialization with Once
Sometimes you need to initialize data exactly once across multiple threads. The std::sync::Once type handles this pattern: call_once runs its closure a single time, and any thread that calls it concurrently blocks until that closure has finished.
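A minimal sketch of bare Once guarding one-time global setup (init_logging is a hypothetical function for illustration):
```rust
use std::sync::Once;

static INIT: Once = Once::new();

fn init_logging() {
    INIT.call_once(|| {
        // Runs exactly once no matter how many threads call
        // init_logging; concurrent callers block until it finishes.
        println!("logging initialized");
    });
}

fn main() {
    let handles: Vec<_> = (0..4).map(|_| std::thread::spawn(init_logging)).collect();
    for h in handles {
        h.join().unwrap();
    }
}
```
Once alone provides the guarantee but no storage for the initialized value. Since Rust 1.70, std::sync::OnceLock pairs the same run-exactly-once behavior with a thread-safe container, which makes lazy initialization of a shared resource straightforward and avoids unsafe pointer juggling: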
```rust
use std::sync::{Arc, OnceLock};
use std::thread;

struct ExpensiveResource {
    data: Vec<u64>,
}

impl ExpensiveResource {
    fn new() -> Self {
        println!("Creating expensive resource...");
        ExpensiveResource {
            data: (0..1000).collect(),
        }
    }

    fn get_data(&self) -> &[u64] {
        &self.data
    }
}

fn main() {
    let resource: Arc<OnceLock<ExpensiveResource>> = Arc::new(OnceLock::new());
    let mut handles = vec![];

    for _ in 0..10 {
        let resource_clone = Arc::clone(&resource);
        handles.push(thread::spawn(move || {
            // get_or_init runs the constructor at most once; every other
            // thread blocks until it completes, then shares the same value.
            let res = resource_clone.get_or_init(ExpensiveResource::new);
            println!("Thread got data, sum: {}", res.get_data().iter().sum::<u64>());
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
}
```
The get_or_init call guarantees the initialization closure runs exactly once, even when called from many threads simultaneously; every other thread blocks until the value is ready and then reads the same shared instance.
Thread-Local Storage
Sometimes the best way to handle shared state is to avoid sharing it entirely. Thread-local storage allows each thread to have its own private instance:
```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    static COUNTER: RefCell<u64> = RefCell::new(0);
}

fn increment_counter() {
    COUNTER.with(|counter| {
        *counter.borrow_mut() += 1;
    });
}

fn get_counter() -> u64 {
    COUNTER.with(|counter| *counter.borrow())
}

fn main() {
    let mut handles = vec![];

    for _ in 0..5 {
        handles.push(thread::spawn(move || {
            for _ in 0..10 {
                increment_counter();
                println!(
                    "Thread {:?} counter: {}",
                    thread::current().id(),
                    get_counter()
                );
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    // Main thread's counter is separate
    println!("Main thread counter: {}", get_counter());
}
```
Each thread gets its own independent copy of COUNTER, eliminating the need for synchronization. This approach is especially useful for caches or accumulators that don't need to be shared.
Parking and Unparking Threads
For advanced synchronization scenarios, Rust provides low-level thread parking:
```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let pair = Arc::new((Mutex::new(false), Mutex::new(0)));
    let pair2 = Arc::clone(&pair);

    let handle = thread::spawn(move || {
        let (lock, counter) = &*pair2;
        println!("Thread waiting to be notified...");
        // Park until the flag is set; the loop handles spurious wakeups
        // and the case where unpark arrives before park.
        while !*lock.lock().unwrap() {
            thread::park();
        }
        println!("Thread woke up, incrementing counter");
        let mut counter = counter.lock().unwrap();
        *counter += 1;
    });

    // Give the thread time to park
    thread::sleep(Duration::from_millis(100));

    // Signal the thread to wake up
    let (lock, _) = &*pair;
    *lock.lock().unwrap() = true;

    // Wake up the parked thread
    handle.thread().unpark();
    handle.join().unwrap();

    let (_, counter) = &*pair;
    println!("Final counter value: {}", *counter.lock().unwrap());
}
```
Parking temporarily suspends a thread until it's explicitly unparked. This provides a building block for creating more complex synchronization primitives.
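For comparison, the standard library's Condvar already packages this wait-for-a-flag pattern; here is a minimal sketch of the same handoff built on it.
```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    let handle = thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        let mut ready = lock.lock().unwrap();
        // wait releases the lock while blocked and re-acquires it on
        // wake; the loop guards against spurious wakeups.
        while !*ready {
            ready = cvar.wait(ready).unwrap();
        }
        println!("Woken by Condvar");
    });

    let (lock, cvar) = &*pair;
    *lock.lock().unwrap() = true;
    cvar.notify_one();

    handle.join().unwrap();
}
```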
Conclusion: The Power of Safe Concurrency
Rust's approach to shared mutability has fundamentally changed how I think about concurrent programming. By combining compile-time guarantees with flexible runtime mechanisms, Rust enables building concurrent systems that are both safe and performant.
The key insight is that most concurrency bugs come from uncontrolled access to shared mutable state. By providing tools to carefully manage that access, Rust eliminates whole categories of bugs without sacrificing performance.
From my experience building production systems, Rust's concurrency model leads to code that's not only safer but often more efficient than equivalent C++ implementations. The clarity around ownership and access patterns makes it easier to reason about performance characteristics.
Whether you're building a high-throughput web server, a data processing pipeline, or a real-time system, Rust's approach to shared mutability provides the tools you need to build concurrent code with confidence.