This is the second of a three-part series which covers various aspects of Python's memory management. It started life as a conference talk I gave in 2021, titled 'Pointers? In My Python?' and the most recent recording of it can be found here.
Check out Part 1 of the series, or read on for a discussion of Object IDs in Python!
Object IDs, and why they matter
We ended Part 1 with the following question: how do we know when two Python objects are really the same object in memory? If we do b = deepcopy(a), how can we know for sure that it didn't just create a new pointer instead of a whole new object? The answer is Object IDs.
Python has a built-in id function with the following properties:
- id(x) is an integer
- id(x) != id(y) exactly when x and y point at different objects that are in memory at the same time
- id(x) is constant for the lifetime of x - that is, as long as x remains in memory
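These three properties can be checked directly in an interpreter. The snippet below is a minimal sketch; the variable names are purely illustrative.

```python
# Two distinct list objects that happen to hold equal contents.
x = ["a", "list"]
y = ["a", "list"]

# Property 1: id() returns an integer.
id_is_int = isinstance(id(x), int)

# Property 2: two objects that are alive at the same time have different ids.
ids_differ = id(x) != id(y)

# Property 3: the id does not change while the object is alive,
# even if the object itself is mutated.
id_before = id(x)
x.append("grows")
id_unchanged = id(x) == id_before

print(id_is_int, ids_differ, id_unchanged)  # True True True
```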
There are many implementations of Python, and while the above three things must be true of the id function in each of them, they don't all do it in the same way under the hood. Some implementations, such as CPython (the Python interpreter written in C), use the object's memory address as its id - but don't assume that all implementations will!
For example, Skulpt is the Python-to-JavaScript compiler which Anvil uses to run Python client code in the browser, so you can develop for the web without having to write JavaScript. Skulpt's implementation of id generates and caches a random number for every object in memory.
For the rest of this article, we'll be using examples generated with CPython, which equates an object's id with its address in memory.
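One way to see this link in CPython is to compare id with the address that appears in an object's default repr. This is a sketch that relies on a CPython implementation detail (the "at 0x..." part of the default repr), and Thing is a hypothetical class used only for illustration; other implementations may print something different.

```python
import re
import sys


class Thing:
    """Hypothetical empty class, used only for this demonstration."""


thing = Thing()

# On CPython the default repr looks like "<__main__.Thing object at 0x7f...>".
default_repr = object.__repr__(thing)

# Pull the hexadecimal address out of the repr, if there is one.
match = re.search(r"0x[0-9a-fA-F]+", default_repr)
address_in_repr = int(match.group(0), 16) if match else None

# Expected to be True on CPython; not guaranteed anywhere else.
address_matches_id = address_in_repr == id(thing)

print(sys.implementation.name, address_matches_id)
```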
So, let's look at what happens when we check an object's id!
>>> a = ["a", "list"]
>>> id(a)
139865338256192
>>> b = a
>>> id(a), id(b)
(139865338256192, 139865338256192)
Here we've defined a list a, and created a new pointer to it by setting b = a. When we check their ids, we can see that they're the same - a and b point to the same thing.
>>> c = a.copy()
>>> id(a), id(c)
(139865338256192, 139865337919872)
Trying the same thing with c = a.copy() shows that this creates a new list object; a and c have different ids.
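The same check answers the deepcopy question from the start of this article. The sketch below assumes a list containing a nested list (an illustrative choice, not taken from the examples above) so that the difference between a new pointer, a shallow copy, and a deep copy is visible.

```python
from copy import deepcopy

a = [["nested"], "list"]
b = a             # a new pointer to the same list object
c = a.copy()      # a new outer list that still shares the inner list
d = deepcopy(a)   # a new outer list and a new inner list

same_object = id(a) == id(b)         # True: b is just another name for a
new_outer = id(a) != id(c)           # True: copy() built a new list
shared_inner = id(a[0]) == id(c[0])  # True: the copy is shallow
new_inner = id(a[0]) != id(d[0])     # True: deepcopy copied the inner list too

# Mutating through b is visible through a, but not through the copy c.
b.append("extra")
print(a)  # [['nested'], 'list', 'extra']
print(c)  # [['nested'], 'list']
```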
However, that isn't the only notion of 'sameness' that Python provides. Consider our familiar example, with a pointing to a list object, b another pointer to that object, and c a pointer to a copy of that object:
>>> a = ["my", "list"]
>>> b = a
>>> c = a.copy()
With this setup, we can do the following comparisons:
>>> a == b
True
>>> a is b
True
>>> a == c
True
>>> a is c
False
Once again: what is going on here? How can two things be the same and not the same? The answer is that the is and == operators are designed to serve two different purposes. The is operator is for when you want to know if two pointers are pointing at the exact same object in memory; the == operator is for when you want to know if two objects should be considered to be equal.
is uses id(x)
Saying a is b is directly equivalent to saying id(a) == id(b). When you use is on two objects, Python takes their ids and directly compares them.
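The equivalence can be checked on the a, b and c from the example above. This is a small sketch; note that it only compares objects that are alive at the same time, because an implementation such as CPython may reuse an address (and so an id) once an object has been freed.

```python
a = ["my", "list"]
b = a
c = a.copy()

# For objects that are alive at the same time, `is` agrees with comparing ids.
a_is_b = a is b
a_is_c = a is c
ids_a_b_match = id(a) == id(b)
ids_a_c_match = id(a) == id(c)

print(a_is_b, ids_a_b_match)  # True True
print(a_is_c, ids_a_c_match)  # False False
```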
== uses __eq__
When you write a == b, you're actually calling a magic method, also known as a dunder method (named for the double-underscore on each side). You might be familiar with some magic methods already, such as:
- __init__, called when an instance of a Python class is initialised
- __str__, called when you use the str built-in - e.g. str(some_object)
- __repr__, similar to __str__ but also called in other circumstances such as error messages
Magic methods are simply methods on a Python class, and the double underscores indicate that they interact with built-in Python functions and operators. For example, overriding the __str__ method on a Python class changes how the str built-in behaves when you call it on an instance of that class.
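As a quick illustration of the __str__ case, here is a sketch with two hypothetical classes, Plain and Named, which differ only in whether they define __str__.

```python
class Plain:
    """Hypothetical class with no __str__ of its own."""


class Named:
    """Hypothetical class that overrides __str__."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return f"Named({self.name})"


# Without __str__, str() falls back to the default representation,
# which in CPython looks like "<__main__.Plain object at 0x...>".
default_text = str(Plain())

# With __str__ defined, str() returns whatever the method returns.
custom_text = str(Named("Jane"))
print(custom_text)  # Named(Jane)
```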
When it comes to == and __eq__, it's easiest to understand with some examples. Let's dive in!
class MyClass:
    def __eq__(self, other):
        return self is other
Here we've defined a custom class with its own __eq__ method. Every __eq__ method takes two arguments, self and other, because whenever it's called it compares two objects: the instance of the class in question and the object it is being compared with. In the above example, we've just set the method to fall through to the is definition of equality (comparing the ids of each object). As it happens, this is actually the default behaviour for any user-defined class in Python.
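That default is easy to confirm with a class that defines no __eq__ at all. NoEq is a hypothetical class used only for this sketch.

```python
class NoEq:
    """Hypothetical class that does not define __eq__."""

    def __init__(self, name):
        self.name = name


first = NoEq("Jane")
second = NoEq("Jane")  # same contents, but a different object
alias = first          # another pointer to the same object

# With no __eq__ defined, == behaves like an identity check.
print(first == second)  # False
print(first == alias)   # True
```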
So, what happens if we define some non-default behaviour?
class MyAlwaysTrueClass:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return True
Here we've defined a class which takes a name argument (so we can keep track of our instances!) and has an __eq__ method which indiscriminately returns True. This gives us the following behaviour:
>>> jane = MyAlwaysTrueClass("Jane")
>>> bob = MyAlwaysTrueClass("Bob")
>>> jane.name == bob.name
False
>>> jane == bob
True
Because we overrode the __eq__ method to always return True, all instances of this class will be considered equal under the == operator - even when their names have different values!
Conversely, we can also do the following:
class MyAlwaysFalseClass:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return False
You might think this is more sensible, but consider:
>>> a = MyAlwaysFalseClass("name")
>>> a == a
False
Moreover, because Python calls the __eq__ method of the left-hand object first, the result depends on which object is self and which is other. That means we can get the following:
>>> jane = MyAlwaysTrueClass("Jane")
>>> bob = MyAlwaysFalseClass("Bob")
>>> jane == bob
True
>>> bob == jane
False
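For contrast, here is a sketch of a more conventional __eq__ that compares instances by their name attribute. Person is a hypothetical class, not part of the examples above; returning NotImplemented for unrelated types is a common convention that lets Python fall back to its other comparison rules.

```python
class Person:
    """Hypothetical class whose instances are equal when their names match."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Person):
            # Signal that this class does not know how to compare with `other`.
            return NotImplemented
        return self.name == other.name


jane = Person("Jane")
other_jane = Person("Jane")
bob = Person("Bob")

print(jane == jane)        # True
print(jane == other_jane)  # True, and other_jane == jane agrees
print(jane is other_jane)  # False: equal, but not the same object
print(jane == bob)         # False
print(jane == "Jane")      # False: a Person is never equal to a plain string
```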
In summary: magic methods are a fun way to make Python do things that seem very strange.
Earlier, we mentioned that id(x) is constant and unique 'for the lifetime of x', which is equivalent to saying 'as long as x remains in memory'. That raises the question: once we've created an object x, how does its 'lifetime' end? Hold on 'til Part 3 of this series, where we'll get the answer!
More about Anvil
If you're new here, welcome! Anvil is a platform for building full-stack web apps with nothing but Python. No need to wrestle with JS, HTML, CSS, Python, SQL and all their frameworks – just build it all in Python.