DEV Community

Prashant Sharma
Prashant Sharma

Posted on • Originally published at sharmapacific.in

SpeedUp Python List and Dictionary

Today, we will be discussing the optimization technique in Python. In this article, you will get to know to speed up your code by avoiding the re-evaluation inside a list and dictionary.

Here I have written the decorator function to calculate the execution time of a function.

import functools
import time

def timeit(func):
    @functools.wraps(func)
    def newfunc(*args, **kwargs):
        startTime = time.time()
        func(*args, **kwargs)
        elapsedTime = time.time() - startTime
        print('function - {}, took {} ms to complete'.format(func.__name__, int(elapsedTime * 1000)))
    return newfunc

let's move to the actual function

Avoid Re-evaluation in Lists

Evaluating nums.append inside the loop

@timeit
def append_inside_loop(limit):
    nums = []
    for num in limit:
        nums.append(num)

append_inside_loop(list(range(1, 9999999)))

In the above function nums.append function references that are re-evaluated each time through the loop. After execution, The total time taken by the above function

o/p - function - append_inside_loop, took 529 ms to complete

Evaluating nums.append outside the loop

@timeit
def append_outside_loop(limit):
    nums = []
    append = nums.append
    for num in limit:
        append(num)

append_outside_loop(list(range(1, 9999999)))

In the above function, I evaluate nums.append outside the loop and used append inside the loop as a variable. Total time is taken by the above function

o/p - function - append_outside_loop, took 328 ms to complete

As you can see when I have evaluated the append = nums.append outside the for loop as a local variable, it took less time and speed-up the code by 201 ms.

The same technique we can apply to the dictionary case also, look at the below example

Avoid Re-evaluation in Dictionary

Evaluating data.get each time inside the loop

@timeit
def inside_evaluation(limit):
    data = {}
    for num in limit:
        data[num] = data.get(num, 0) + 1

inside_evaluation(list(range(1, 9999999)))

Total Time taken by the above function -

o/p - function - inside_evaluation, took 1400 ms to complete

Evaluating data.get outside the loop

@timeit
def outside_evaluation(limit):
    data = {}
    get = data.get
    for num in limit:
        data[num] = get(num, 0) + 1


outside_evaluation(list(range(1, 9999999)))

Total time taken by the above function -

o/p - function - outside_evaluation, took 1189 ms to complete

As you can see we have speed-up the code here by 211 ms.

I hope you like the explanation of the optimization technique in Python for the list and dictionary. Still, if any doubt or improvement regarding it, ask in the comment section. Also, don't forget to share your optimization technique.

Top comments (11)

Collapse
 
sepyrt profile image
Dimitris R

Wow thanks! I had no idea this was even possible.

For the list append, is it because of the use of len()? Just checked the code because I wanted to know why this is happening. Couldn't figure out the dictionary though.

Collapse
 
dwd profile image
Dave Cridland

Both are actually the same effect. When you call data.get(...), you're calling the method, of course, but also looking up the method in the data object's internal dictionary, called __dict__. What the above article is showing is that there's some savings to be made if you cache the lookup.

He's not optimizing a list at all, but he is optimizing a dict - in both cases.

Because data[num] = ... is actually data.set(num, ...), there's another candidate for optimization there, as well - but by now you can probably guess what it is.

Collapse
 
dzhamal265 profile image
panda

It's was fixed in python3.8

Collapse
 
rafaacioly profile image
Rafael Acioly

so v3.8 is faster? (like the article says)

Collapse
 
milan997 profile image
milan997

Do you have some source on this? How it was fixed?

Collapse
 
idanatias profile image
Idan Atias • Edited

Thanks for sharing.
I wonder though if this reduces code readability.
If we are not talking about mission critical paths, then it may not worth the 10-15 % reduction.

Collapse
 
dmerejkowsky profile image
Dimitri Merejkowsky

I wonder though if this reduces code readability.

Yes it does - idiomatic Python code is typically not written this way.

If we are not talking about mission critical paths, then it may not worth the 10-15 % reduction.

This. Before applying this technique, make sure to measure the performance of your entire application!

Finally, a better advice for beginners (in my opinion), is to advise them to use comprehensions when they can:

new_list = [append(num) for num in list]

is more idiomatic and even faster than the technique presented here.

Collapse
 
sharmapacific profile image
Prashant Sharma

@Dimitri Did you tried this -

Kindly Look at the result below -

@timeit
def append_outside_loop(limit):
    nums = []
    append = nums.append
    for num in limit:
        append(num)

append_outside_loop(list(range(1, 9999999)))
o/p - function - append_outside_loop, took 445 ms to complete

and as you said -

@timeit
def append_outside_loop(limit):
    nums = []
    append = nums.append
    [append(num) for num in limit]

append_outside_loop(list(range(1, 9999999)))
o/p - function - append_outside_loop, took 602 ms to complete
Thread Thread
 
dmerejkowsky profile image
Dimitri Merejkowsky

Excellent! You took the time to verify what I was saying without giving proof - and you we right to do so!

The problem is that the last line in the second function is actually building a list.

Here's a better example:

def is_even(x):
    return x % 2 == 0

@timeit
def func1(my_list):
   """ calling append() in a loop """
    my_other_list = []
    for x in my_list:
        if is_even(x):
            my_other_list.append(x)


func1(list(range(1, 9999999)))

@timeit
def func2(my_list):
    """Saving one 'op' by not looking up the is_even 
   function every time
    """
    my_other_list = []
    func = is_even
    for x in my_list:
        if func(x):
            my_other_list.append(x)


func2(list(range(1, 9999999)))

@timeit
def func3(my_list):
    """Using list comprehensions
    """
    my_other_list = [x for x in my_list if is_even(x)]

func3(list(range(1, 9999999)))
Collapse
 
jcsvveiga profile image
João Veiga

Great article!
Check out the dis module and disassemble the code and it explains why this happens. TLDR: it's one less instruction inside the hot loop.

Collapse
 
milan997 profile image
milan997

This makes me thing of how slow really Python must be if the evaluation takes so long... Anyone got more info that, how much that affects performance and why?