Introduction: deque vs list
Have you ever had to append and pop elements from either side of a list
in python?
If you are working with a small dataset, you are probably going to be fine. However, if you are working with a large dataset, using list
will certainly hurt your perfomance. Here enters collections.deque.
We will introduce the deque
data structure then, we will compare their execution time with the counterpart list
operations.
What is a python deque?
According to python's documentation:
Deques are a generalization of stacks and queues (the name is pronounced “deck” and is short for “double-ended queue”). Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
In python, list
operations pop
from the end and append
will also have time complexity O(1). However, list
operations pop
from the start and insert
to the start will have time complexity of O(n), and there lies the big difference from deque
and list
.
For more information:
Let's check that concept by measuring and comparing the execution time of similar operations.
Time comparison of similar operations
We will use timeit to compute the execution time. Basically, timeit
executes a desired operation a specified number
of times and returns the elapsed time.
Don't focus on the absolute time values, but rather, focus on how the execution time of similar operations compare with each other.
We will setup our data structures list
as ls
and deque
as dq
and use them from now on. You should assume that whenever we are executing some operation, we are using the initial setup variables as below.
from timeit import timeit
from collections import deque
# timeit number of executions
tnumber = 100
N = 10000000
ls = list(range(N))
dq = deque(range(N))
Adding elements to the end: append
Let's append new elements.
timeit(lambda: ls.append(42), number=tnumber)
# 1.8183000065619126e-05 secs
timeit(lambda: dq.append(42), number=tnumber)
# 1.8543999431130942e-05 secs
Despite some running variation, in that case, the execution time is roughly the same.
Removing elements from the end: pop
Let's remove some elements from the end.
timeit(lambda: ls.pop(), number=tnumber)
# 3.296200065960875e-05 secs
timeit(lambda: dq.pop(), number=tnumber)
# 3.3829999665613286e-05 secs
Despite some running variation, in that case, the execution time is also roughly the same.
Adding elements to the start: insert vs appendleft
Now, we will see the difference between those data structures.
timeit(lambda: ls.insert(0, 42), number=tnumber)
# 1.6244264469996779 secs
timeit(lambda: dq.appendleft(42), number=tnumber)
# 2.668900015123654e-05
Notice that dq.appendleft(42)
is about 60k times faster than ls.insert(0, 42)
, that's more than enough to consider using deque
.
Removing elements from the start: pop vs popleft
Let's see the difference with respect to removing elements from the start.
timeit(lambda: ls.pop(0), number=tnumber)
# 1.1129129020000619 secs
timeit(lambda: dq.popleft(), number=tnumber)
# 2.6808000257005915e-05 secs
Notice that dq.popleft()
, in that case, is about 40k times faster than ls.pop(0)
.
Adding elements in the middle: insert
mid = int(N/2)
timeit(lambda: ls.insert(mid, 42), number=tnumber)
# 0.6198845629996868 secs
timeit(lambda: dq.insert(mid, 42), number=tnumber)
# 1.6952943249998498 secs
Notice that ls.insert(mid, 42)
is about 3 times faster than dq.insert(mid, 42)
. So, if you are considering adding elements into random positions in your data structure, maybe deque
might not be a better option. That difference may be due to the cost of managing the deque
doubly-linked data structure.
Disadvantages of deque
There will always be pros and cons. You should pay attention to your use cases as there will be scenarios in which one option is more appropriate than other.
For instance, you will not be able to quickly slice your deque
as you do with a list
.
dq[1000:1500]
Will result in:
-----------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-107-df775f3348c6> in <module>()
----> 1 dq[1000:1500]
TypeError: sequence index must be integer, not 'slice'
If you want to do that, you will have to use a different tool like itertools.islice.
import itertools
dq_slice = itertools.islice(dq, 1000, 1500)
Bear in mind that the returned object will have a special type itertools.islice
, so you will have to handle it properly.
Closing remarks
This material is intended for general use.
Let me know if something is unclear. Any feedback is more than welcome.
That's a wrap. Happy coding!
Top comments (0)