To understand the concept of ownership, it is necessary to have an idea of what the stack and the heap are and how they work.
The Stack and Heap
Variables are names bound to values in memory. Each value is stored either on the stack or on the heap, two regions of memory available to your program at runtime.
Every value placed on the stack must have a fixed size that is known at compile time, so the stack holds types such as integers, floats, and booleans.
// STACK VARIABLES
let x: i32 = 2;
let y: f64 = 3.14;
let z: bool = false;
The heap stores data whose size is unknown at compile time or can change while the program runs.
New elements can be pushed into the vector a while the program runs, so its elements are stored on the heap.
let a: Vec<i32> = vec![2, 4, 6, 8];
The variable b holds a value that implements the Display trait. Such a value can be of any size, so we store it on the heap using the Box smart pointer.
use std::fmt::Display;
let b: Box<dyn Display> = Box::new(12);
The variable c is a String, and we can add more characters to it while the program runs, so its contents are also stored on the heap.
let c: String = String::from("hello");
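As a minimal sketch of why these values need the heap, the snippet below grows a vector and a string at runtime. The function name grow is made up for this illustration.

```rust
// Hypothetical helper: grows a Vec and a String at runtime and
// returns their new lengths.
fn grow() -> (usize, usize) {
    let mut a: Vec<i32> = vec![2, 4, 6, 8];
    a.push(10); // the heap buffer may be reallocated to make room

    let mut c: String = String::from("hello");
    c.push_str(", world"); // the string's heap buffer grows as well

    (a.len(), c.len())
}

fn main() {
    let (vec_len, string_len) = grow();
    println!("vector length: {vec_len}, string length in bytes: {string_len}");
}
```

Neither final length has to be known when the program is compiled, which is what separates these types from the stack variables shown earlier.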
The stack stores values in the order it receives them and removes them in the opposite order. In the program below, 2 is pushed onto the stack at the start of the outer scope, and w refers to it. When we get to the inner scope, 3.0 and 4 are pushed to the top of the stack, and x and y refer to their respective values.
fn main() {
    // OUTER SCOPE
    // w is pushed to the top of the stack
    let w = 2;
    {
        // INNER SCOPE
        // x and y are pushed to the top of the stack
        let x = 3.0;
        let y = 4;
    }
    // y and x are popped off the stack and
    // cannot be used here
}
When we get to the end of the inner scope, y and x are popped off the stack. Most programming languages handle the addition and removal of stack data in this way.
For adding data on the heap, the memory allocator finds a space big enough to hold the data, stores it, and returns a pointer which is the address of that memory.
Programming languages take different approaches to removing data from the heap. High-level languages, like JavaScript and Go, have a garbage collector that periodically finds memory that is no longer in use and frees it.
Low-level languages allow the user to explicitly allocate and free memory on the heap. In languages like C and C++, the programmer manually allocates memory which returns a pointer and deallocates the memory using that pointer.
If this is not done properly, several bugs can occur: a memory leak when the programmer forgets to free memory, which can exhaust available memory and crash a long-running program; a use-after-free when the program reads from memory that has already been freed; and a double free when the same memory is freed twice.
Rust Ownership Model
Rust does things differently with its ownership model.
When we add data to the heap, the pointer to it, which is the address of that memory, is stored on the stack since its size is known: 8 bytes on a 64-bit system and 4 bytes on a 32-bit system. Types such as String and Vec also keep a length and a capacity next to the pointer.
The variable is bound to that pointer on the stack and is called the owner of the data.
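The sizes of these stack-side handles can be checked with std::mem::size_of. This is a small sketch, and box_handle_size is a made-up helper name. The printed numbers depend on the target platform, so no output is shown.

```rust
use std::mem::size_of;

// Hypothetical helper: size in bytes of a Box<i32> handle,
// which is a single pointer.
fn box_handle_size() -> usize {
    size_of::<Box<i32>>()
}

fn main() {
    // usize is pointer-sized: 8 bytes on 64-bit targets, 4 on 32-bit ones.
    println!("pointer:  {} bytes", size_of::<usize>());
    println!("Box<i32>: {} bytes", box_handle_size());
    // Vec and String handles also store a length and a capacity,
    // so they are larger than a single pointer.
    println!("Vec<i32>: {} bytes", size_of::<Vec<i32>>());
    println!("String:   {} bytes", size_of::<String>());
}
```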
When a heap variable goes out of scope, the pointer is popped off the stack, and whenever the pointer is popped off the stack, the data on the heap is cleared.
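One way to observe this cleanup is to implement the Drop trait, which Rust runs when an owner goes out of scope. Vec, String, and Box free their heap memory through the same mechanism. The sketch below uses a made-up Noisy type that records its name in a log when it is dropped.

```rust
use std::cell::RefCell;

// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the names in the order the values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _first = Noisy { name: "first", log: &log };
            let _second = Noisy { name: "second", log: &log };
            // inner scope ends here: _second is dropped, then _first
        }
        // outer scope ends here: _outer is dropped
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // prints ["second", "first", "outer"]
}
```

The values are dropped in the reverse of the order in which they were declared, matching the way the stack pops its entries.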
Rust’s ownership model ensures that each piece of heap data has exactly one owner, which prevents pointers to freed memory. Suppose two variables were allowed to own the same heap data. Both pointers would be stored on the stack, and as soon as one of the variables went out of scope, its pointer would be popped off and the heap data would be freed.
Reading the data through the other pointer would then access freed memory, resulting in unexpected behaviour or a crash.
When we assign a stack variable to a new variable, the data is copied and the copy is pushed to the top of the stack. This is cheap because the size of the data is known at compile time. The process is called copying, and it applies to types that implement the Copy trait.
When we print the values of x and y, we get 3 for both.
let x = 3;
let y = x;
println!("x: {x}");
println!("y: {y}");
x: 3
y: 3
If we assign a heap variable to another variable, copying the data on the heap will be costly, since the memory allocator will have to first find a space on the heap, big enough to contain the data, before copying the values over to that address.
let x = String::from("hello");
let y = x;
Rust does not simply copy the pointer on the stack either, since that would leave two pointers to the same location on the heap. Instead, ownership of the pointer is moved: the new variable takes it over, and the old variable becomes invalid.
The variable y now holds the string’s pointer, and when we try to read the variable x, we get an error.
println!("{x}");
error[E0382]: borrow of moved value: `x`
--> main.rs:16:13
|
12 | let x = String::from("hello");
| - move occurs because `x` has type `String`, which does not implement the `Copy` trait
13 |
14 | let y = x;
| - value moved here
15 |
16 | println!("{x}");
| ^^^ value borrowed here after move
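If both variables really need their own copy of the heap data, the copy can be requested explicitly with clone. This is a minimal sketch, and the function name clone_instead_of_move is made up for the illustration.

```rust
// Hypothetical helper: returns both strings to show that each variable
// owns its own heap data after an explicit clone.
fn clone_instead_of_move() -> (String, String) {
    let x = String::from("hello");
    let mut y = x.clone(); // explicit deep copy into a new heap allocation
    y.push_str(" world"); // changing y does not affect x
    (x, y)
}

fn main() {
    let (x, y) = clone_instead_of_move();
    println!("x: {x}"); // prints "x: hello"
    println!("y: {y}"); // prints "y: hello world"
}
```

Because the costly copy only happens where clone is written, it stays visible in the code instead of happening silently on every assignment.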
Likewise, when we pass values as function arguments, values of Copy types such as integers are copied into the parameters, while values that own heap data are moved, and the function parameter becomes the new owner. The old variable is then invalid, and trying to use it gives an error at compile time.
fn main() {
    let s = vec![2, 4, 6, 8];
    let t = 1;
    // s is moved into the first parameter of the function
    // while t is copied
    get_elem(s, t);
    // s is invalid here, but t is still valid
}
fn get_elem(v: Vec<i32>, u: usize) -> i32 {
    v[u]
}
When the function’s scope ends, the parameter v is popped off the stack and the vector it owns is dropped.
We can transfer back ownership by returning the values and assigning them to the variables.
fn main() {
    let s = vec![2, 4, 6, 8];
    let t = 1;
    let (result, s) = get_elem(s, t);
}
fn get_elem(v: Vec<i32>, u: usize) -> (i32, Vec<i32>) {
    (v[u], v)
}
This approach is a bit tedious. If we want to keep ownership while using the values in a function, we make use of references.
References and Borrowing in Rust
A reference is a pointer that gives access to a value without taking ownership of it. It can point to data on the stack or, through the owner, to data on the heap.
We modify the function to accept a reference to a vector by adding an ampersand.
fn main() {
    let y = vec![2, 4, 6, 8];
    let x = 1;
    let result = get_elem(&y, x);
}
fn get_elem(v: &Vec<i32>, u: usize) -> i32 {
    v[u]
}
When we create a reference to a variable, a pointer to that variable is pushed onto the stack. The compiler makes sure this reference always points to valid data using a set of rules called the borrowing rules. The act of creating a reference to a variable is called borrowing.
Now, when the function’s scope ends, the reference is popped off the stack and not the actual value.
We can now use the variable y after the function.
fn main() {
    let y = vec![2, 4, 6, 8];
    let x = 1;
    let result = get_elem(&y, x);
    // Vec does not implement Display, so we print it with the Debug format
    println!("{y:?}");
}
fn get_elem(v: &Vec<i32>, u: usize) -> i32 {
    v[u]
}
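The borrowing rules mentioned above allow any number of shared references (&) at the same time, or a single mutable reference (&mut), but never both at once. The sketch below stays within those rules, and borrow_demo is a made-up name for the illustration.

```rust
// Hypothetical helper: reads a vector through two shared references,
// then appends to it through a mutable one.
fn borrow_demo() -> (i32, Vec<i32>) {
    let mut v = vec![2, 4, 6, 8];

    // Any number of shared references may exist at the same time.
    let r1 = &v;
    let r2 = &v;
    let total = r1.iter().sum::<i32>() + r2[0];

    // r1 and r2 are not used past this point, so a mutable
    // reference is now allowed.
    let m = &mut v;
    m.push(10);

    (total, v)
}

fn main() {
    let (total, v) = borrow_demo();
    println!("total: {total}, v: {v:?}"); // prints "total: 22, v: [2, 4, 6, 8, 10]"
}
```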
If a variable is moved, all references to it become invalid. Here, the variable x holds a String, and y references x.
let x = "hello".to_string();
let y = &x;
If we move x into an inner scope and try to use y after that scope, we get an error. This is because moving x invalidates its references, so the compiler rejects the move while y is still in use.
let x = "hello".to_string();
let y = &x;
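{
    x; // x is moved here and dropped
}
// Here, all references to x are invalid
// since x is invalid, so using y is rejected:
// println!("{y}"); // error[E0505]: cannot move out of `x` because it is borrowed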
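For contrast, here is a sketch of an ordering the compiler accepts: the reference is used for the last time before the move happens. The function name use_then_move is made up for this illustration.

```rust
// Hypothetical helper: uses the reference first, then moves the String.
fn use_then_move() -> (usize, String) {
    let x = "hello".to_string();
    let y = &x;
    let len = y.len(); // last use of y, so the borrow ends here
    let moved = x; // moving x is fine: no reference is still in use
    (len, moved)
}

fn main() {
    let (len, moved) = use_then_move();
    println!("{len} {moved}"); // prints "5 hello"
}
```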
To learn more about references and the borrowing rules check out this article
Thanks for reading.
An article from my website https://cudi.dev/articles/ownership_in_rust_explained