loganwohlers

# Memory, Reference Semantics, and Life in Our Solar System

My goal with this post is to provide a high-level overview of one of the more intimidating concepts for people new to coding: reference semantics. When I was first learning how to code, I found this topic very difficult, and I didn't even bother trying to understand how memory in a computer worked. Recently I've put in some effort to understand what is happening under the hood and why things work the way they do. This should (hopefully) serve as an easy introduction to some of these concepts. I fell into the trap of thinking of this as non-practical material that you might only see in an interview or a classroom setting, but I've found that even a basic understanding of these concepts has translated well across multiple languages (as well as helping out with more advanced data structures such as linked lists and binary trees) and has made my life much easier.

Let's start with a brief attempt to demystify reference semantics, the topic of pass by reference vs. pass by value. Ultimately this idea covers how variables are stored and modified in a program, and it isn't too complicated, but it is often presented in an overly confusing way. In the simplest terms possible, memory in a computer is like a giant "array". Each "element" holds one byte of data and has a memory address (think index in an array) that starts from 0 and works its way up. These addresses are usually written in hexadecimal, which is why they look like 0x11cee8 as opposed to the normal numbers we know and love. So with this in mind we have to take a look at what each of these elements can actually store. With just one byte of storage space, each element can only hold a very small piece of data, so a simple primitive value (characters, numbers, boolean values, etc.) fits in one element or a small, fixed number of them. Different languages define their own primitive values but they mostly overlap. Longer runs of elements in a row are how we get more advanced data structures like arrays and objects. This fundamental difference between primitives and more complex data structures is the key behind reference semantics.
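To make the "giant array" picture a bit more concrete, here is a minimal sketch that uses a `Uint8Array` as a pretend 16-byte memory. The names `memory` and `toHexAddress` are made up for illustration, and real memory is of course managed by the language runtime and operating system, not by an array like this.

```javascript
// A tiny pretend "memory": 16 bytes, all initially 0.
const memory = new Uint8Array(16);

// Store the character 'A' (character code 65) in the byte at address 3.
memory[3] = 'A'.charCodeAt(0);

// Hypothetical helper: write an index the way addresses are usually shown (hexadecimal).
function toHexAddress(index) {
  return '0x' + index.toString(16).padStart(2, '0');
}

console.log(toHexAddress(3), memory[3]); // 0x03 65
console.log(toHexAddress(11));           // 0x0b

// One byte can only hold 0 to 255, so a bigger number needs several bytes in a row.
const view = new DataView(memory.buffer);
view.setUint32(8, 1000000);              // occupies addresses 8, 9, 10 and 11
console.log(Array.from(memory.slice(8, 12))); // [ 0, 15, 66, 64 ]
```

The last line shows the number 1,000,000 spread across four neighboring one-byte elements, which is the same idea that lets larger structures occupy a run of addresses.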

PASS BY VALUE

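    let x = 5;
    let y = x; // y gets its own copy of the value 5

    // The parameter x is a local copy; it is separate from the outer x.
    function changeX(x) {
        x = x + 1;
        return x;
    }

    x = changeX(x); // x is now 6
    console.log(x); // prints 6
    console.log(y); // prints 5 (y is unchanged)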

When we declare a variable in our program, a chunk of memory is carved out somewhere for its value to be stored. A simplified way to picture it: primitive values are ALREADY known to the computer. They are fixed values that don't change (i.e. we aren't creating a new letter in the alphabet, whereas an object or an array can store anything). So when we create a variable and set its value to a primitive, our computer says, "Oh, I know this value; here's a copy of it to use for your variable." The key here is that the variable contains a COPY of that value.

In the example above:

- We carve out space in memory for the variable x. It gets passed a COPY of the int 5.

- We set y equal to x. It ALSO gets passed a COPY of the int 5.

- We change the value of x to 6, but it doesn't affect y. They both started with copies of the number 5 and have no relation to each other, so changing x had NO EFFECT on y.
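The same copying happens when a primitive is handed to a function. The sketch below uses made-up names (`score`, `backup`, `addBonus`) and, unlike the example above, does not assign the result back to the original variable, so you can see that the function only ever worked on its own copy.

```javascript
let score = 10;
let backup = score; // backup receives its own copy of 10

// Hypothetical function: value is a local copy of whatever was passed in.
function addBonus(value) {
  value = value + 5; // changes only the local copy
  return value;
}

const result = addBonus(score);

console.log(score);  // 10 (the caller's variable is untouched)
console.log(backup); // 10
console.log(result); // 15
```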

Now let's try the same thing with an object.

PASS BY REFERENCE

    let person1 = { name: 'logan', age: 24 };
    let person2 = person1; // person2 refers to the SAME object as person1

    function agePerson(person) {
        person.age = 100; // changes the object that was passed in
    }

    console.log(person1); // { name: 'logan', age: 24 }

    agePerson(person1); // person1's age is now 100

    console.log(person1); // { name: 'logan', age: 100 }

    console.log(person2); // also changed: { name: 'logan', age: 100 }

This time the change DID show up in the second variable, even though this code follows the same pattern as the example above. So what's happening here? When a variable is declared for an object/data structure, the computer says something along the lines of, "I have no idea what you're going to put in here, but here's some space and the actual memory address so that you can interact with it again later." It hands us a REFERENCE to the object's actual location in memory.
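One way to see that a variable holds a reference and not the object itself is the `===` operator, which for objects checks whether both sides refer to the very same object. A small sketch, with variable names chosen only for illustration:

```javascript
const a = { name: 'logan', age: 24 };
const b = a;                                  // b holds the same reference as a
const lookalike = { name: 'logan', age: 24 }; // a separate object with equal contents

console.log(a === b);         // true  (same underlying object)
console.log(a === lookalike); // false (different objects, even though they look alike)
```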

- Both the person1 and person2 variables simply contain a REFERENCE to the underlying object's memory address. They DON'T STORE SEPARATE COPIES OF ITS VALUE.

- You can think of this reference as just the hexadecimal address where the object is stored in memory. So when we change person1 via the agePerson function, the underlying object (what person2 ALSO points at) IS changed, and the change is reflected in both variables.
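If that shared change is not what you want, one common approach is to work on a copy. Here is a minimal sketch using object spread; `agedCopy` is a hypothetical helper, not something from the example above.

```javascript
const person = { name: 'logan', age: 24 };

// Hypothetical helper: returns an aged COPY instead of changing its argument.
function agedCopy(p, newAge) {
  return { ...p, age: newAge }; // spread copies the properties, then age is overridden
}

const older = agedCopy(person, 100);

console.log(person.age);       // 24 (the original is untouched)
console.log(older.age);        // 100
console.log(older === person); // false (two separate objects)
```

Note that spread makes a shallow copy: any nested objects or arrays inside the original would still be shared between the two.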

When you pass complex data structures around as variables, changes made will affect the actual underlying object, since we have a reference to its underlying memory address. This is why we often make copies, or have to be careful when modifying these, to ensure that we don't destroy or unintentionally modify their values. With primitives, no such caution needs to be taken. Reference semantics are responsible for all of it. Hopefully someone out there found this helpful. Please let me know if you have any questions/comments!