While I was teaching a class about iterators and generators, a student asked me a question that didn't seem very interesting at first: "What is faster - xrange or range?". I tried to explain that in many cases it's better to use xrange instead of range - but because of its lower memory usage, not because of its run time. He continued: "Yes, I get it. But which one is faster?".
Well, it's pretty obvious that building a list with range takes more time than calling xrange - since calling xrange doesn't actually build a list. So to phrase the question more precisely:
What is faster - calling next on an iterator over a list object, or calling next on an iterator over an xrange object?
I had to think about it for a bit. Before looking at the documentation or the source code, I tried to figure out what my intuition said.
Both operations are supposed to be pretty fast compared to other basic operations (like building a list with range):
- Getting the next object from a list iterator is basically accessing an array at a specific index (and incrementing that index).
- Getting the next object from an xrange object is basically incrementing a number and returning it. Again, a very simple operation.
So what is faster? Direct memory access or a simple addition?
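To make the two operations concrete, here is a minimal Python sketch of what each iterator does under the hood (the names `my_xrange` and `my_list_iter` are hypothetical, written for illustration only):

```python
def my_xrange(start, stop, step=1):
    """Yield numbers on demand, like xrange: just arithmetic, no list built."""
    value = start
    while value < stop:
        yield value
        value += step  # "next" is essentially a simple addition


def my_list_iter(items):
    """Iterate over a list: "next" is an indexed memory access."""
    index = 0
    while index < len(items):
        yield items[index]  # direct access at the current index
        index += 1


print(list(my_xrange(0, 5)))             # [0, 1, 2, 3, 4]
print(list(my_list_iter([10, 20, 30])))  # [10, 20, 30]
```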
I thought that maybe the addition (xrange) would be a bit faster, but not by much.
What does the documentation of xrange have to say about it?
Well, according to the Docstring of xrange, it is slightly faster than range:
```
In : xrange?
Docstring:
xrange(stop) -> xrange object
xrange(start, stop[, step]) -> xrange object

Like range(), but instead of returning a list, returns an object that
generates the numbers in the range on demand.  For looping, this is
slightly faster than range() and more memory efficient.
```
But the Docstring doesn't say exactly what is faster about it. So we have to guess that it refers to calling the next method (iterating over the xrange).
But, the "formal" documentation of python, doesn't say anything explicitly about the speed of xrange:
xrange(start, stop[, step])
This function is very similar to range(), but returns an xrange object instead of a list. This is an opaque sequence type which yields the same values as the corresponding list, without actually storing them all simultaneously. The advantage of xrange() over range() is minimal (since xrange() still has to create the values when asked for them) except when a very large range is used on a memory-starved machine or when all of the range’s elements are never used (such as when the loop is usually terminated with break). For more information on xrange objects, see XRange Type and Sequence Types — str, unicode, list, tuple, bytearray, buffer, xrange.
You can see that documentation here - https://docs.python.org/2/library/functions.html#xrange
One of the many things I like about Python is that it is really easy to try things out. If you want to know what is faster, just run the code, and see what is faster!
We want to eliminate the time building the list with range and focus just on the iteration itself.
ipython has a very nice feature for measuring the run time of code - %timeit. We can use it with next on a list iterator / xrange iterator and see which one is faster.
We will use a list / xrange of size 10,000,000 for the test. We will also need to limit the number of iterations %timeit performs, so we won't hit a StopIteration exception (the end of the list / xrange).
Here are the results:
```
In : list_iterator = iter(range(10 ** 7))
In : %timeit -n 1000000 list_iterator.next()
1000000 loops, best of 3: 68.7 ns per loop

In : xrange_iterator = iter(xrange(10 ** 7))
In : %timeit -n 1000000 xrange_iterator.next()
1000000 loops, best of 3: 65.9 ns per loop
```
xrange seems a bit faster, right? By about 4%.
But the results are not so consistent:
```
In : xrange_iterator = iter(xrange(10 ** 7))
In : %timeit -n 1000000 xrange_iterator.next()
1000000 loops, best of 3: 71.6 ns per loop
```
Sometimes xrange is slower!
The source of the differences is probably my running environment (a pretty old Ubuntu 14.04 VM with just 1 GB of memory and a single CPU).
The important part is that there doesn't seem to be a big difference between the speed of xrange and range in Python 2.7.
Well, in Python 3, xrange was removed from the language completely. Now range behaves like the old and beloved xrange: it returns an object you can iterate over - but which is not a list.
So you would think that the new range objects would be faster than the old xrange objects, right?
Let's check it out.
Remember that .next was also removed in Python 3; now we need to call next(iterator) instead of iterator.next().
```
In : range_iterator = iter(range(10 ** 7))
In : %timeit -n 1000000 next(range_iterator)
111 ns ± 7.63 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
```
That's over 110 ns per operation, instead of ~70 ns in Python 2!
And as in Python 2, it seems that iterating over lists takes a similar time:
```
In : list_iterator = iter(list(range(10 ** 7)))
In : %timeit -n 1000000 next(list_iterator)
95.7 ns ± 7.13 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
```
It's even a bit faster than range, but still - it's about 50% slower than iteration in Python 2.
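If you want to reproduce this kind of measurement outside IPython, it can be sketched with the standard timeit module (a Python 3 sketch; I use a smaller range than in the post to keep the example light, and the exact numbers will vary by machine):

```python
import timeit

# Smaller than the 10 ** 7 used in the %timeit runs, to keep this light.
N = 2 * 10 ** 6
CALLS = 10 ** 6  # stay well below N so we never hit StopIteration

list_iterator = iter(list(range(N)))
range_iterator = iter(range(N))

# Each timeit call invokes the lambda exactly CALLS times.
list_time = timeit.timeit(lambda: next(list_iterator), number=CALLS)
range_time = timeit.timeit(lambda: next(range_iterator), number=CALLS)

print("list iterator:  %.1f ns per call" % (list_time / CALLS * 1e9))
print("range iterator: %.1f ns per call" % (range_time / CALLS * 1e9))
```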
This is actually a known phenomenon with Python 3.
You can read some possible explanation for it here - https://stackoverflow.com/questions/23453133/is-there-a-reason-python-3-enumerates-slower-than-python-2
(though, in my opinion, that answer doesn't seem correct - since we use only small numbers, lower than sys.maxint)
The answer in that Stack Overflow link made me wonder - what would happen with very high numbers? Would range and list(range) still have similar speeds?
My intuition said the results would be similar to the ones we saw before - around 100 ns for each call to next.
But again - the best way to find things out is to try!
So let's check it:
```
In : list_iterator_high_numbers = iter(list(range(2 ** 64, 2 ** 64 + 10 ** 7)))
In : %timeit -n 1000000 next(list_iterator_high_numbers)
97 ns ± 12.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
```
As we can see, the result for lists is very similar to the results before. No surprise here.
```
In : range_iterator_high_numbers = iter(range(2 ** 64, 2 ** 64 + 10 ** 7))
In : %timeit -n 1000000 next(range_iterator_high_numbers)
204 ns ± 9.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
```
Now, this is a surprise! Iterating over high numbers with range is much slower than iterating over low numbers. Calling next takes about 100% more time on average!
Why does that happen?
I couldn't find any hint in the documentation, so I had to take the next step and read the source code.
Another great thing about Python is that you can simply read its source code. If you want to understand something more thoroughly, or if you are just curious, you can read the code itself!
The relevant file we want to look at is Objects/rangeobject.c, which contains the implementation of range objects in Python 3.
You can find it here - https://github.com/python/cpython/blob/master/Objects/rangeobject.c
It's pretty easy to find the code of the next method (rangeiter_next), which is the method we are interested in.
The code is pretty straightforward: if the range hasn't reached its end yet, increase the index by one, multiply it by "step", add "start", and return the value as a Python long object. (The casting issue is not relevant to us now.)
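In Python terms, the logic looks roughly like this (a sketch of the C fast path, not the actual CPython code; the class name is made up for illustration):

```python
class RangeIterSketch:
    """Rough Python sketch of the logic in CPython's rangeiter_next,
    where start, step, index and len are plain C longs."""

    def __init__(self, start, step, length):
        self.start = start
        self.step = step
        self.len = length
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index < self.len:
            # A few basic arithmetic operations on machine integers.
            value = self.start + self.index * self.step
            self.index += 1
            return value
        raise StopIteration


print(list(RangeIterSketch(0, 2, 4)))  # [0, 2, 4, 6]
```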
It seems like efficient code. Why would it be less efficient for high numbers?
But wait a second - this code can't handle numbers "longer" than a C long at all! We just saw that range does support very high numbers, with more than 64 bits, while a long variable in C can only store numbers of up to 32 (or 64) bits. There must be some other code that handles the higher numbers.
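We can see the boundary from Python itself: sys.maxsize reflects the platform's machine-word size (typically 2 ** 63 - 1 on a 64-bit build - an assumption, since it depends on the platform), and 2 ** 64 is beyond it:

```python
import sys

# sys.maxsize is typically 2 ** 63 - 1 on a 64-bit build.
print(sys.maxsize)
print(2 ** 64 > sys.maxsize)  # True: 2 ** 64 does not fit in a C long

# The values are still produced correctly, just via a slower code path:
it = iter(range(2 ** 64, 2 ** 64 + 3))
print(next(it), next(it), next(it))
```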
Well, on my first look I missed the comment just above that method, which points to the version that handles the longer numbers - longrangeiter_next. Look at all the code in that method.
Instead of the 3-4 basic arithmetic operations we saw in rangeiter_next, longrangeiter_next performs much more complicated operations. You can't simply use "+" to add Python ints - you have to call PyNumber_Add. You can't use "*" to multiply the "index" by the "step" - you have to call PyNumber_Multiply.
No wonder calling next takes more time for "longer than long" numbers - much more is happening under the hood.
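A rough Python analogy for the slow path: instead of plain machine arithmetic, every step goes through full Python-object operations, a bit like calling operator.add and operator.mul instead of using "+" and "*" on machine integers (again, just a sketch of the C logic, with a made-up class name):

```python
import operator


class LongRangeIterSketch:
    """Rough sketch of longrangeiter_next: every arithmetic step is a
    full Python-object call, like PyNumber_Add / PyNumber_Multiply in C."""

    def __init__(self, start, step, length):
        self.start = start
        self.step = step
        self.len = length
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index < self.len:
            # product = index * step, result = start + product - each one
            # a generic object-protocol call, not a machine instruction.
            product = operator.mul(self.index, self.step)
            result = operator.add(self.start, product)
            self.index = operator.add(self.index, 1)
            return result
        raise StopIteration


print(next(iter(LongRangeIterSketch(2 ** 64, 1, 3))))  # 18446744073709551616
```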
But for the next method of a regular list iterator, the values inside the list don't matter - you don't need any "complicated" arithmetic operations like multiplying Python ints in order to get an item from the list.
So, what did we see here?
We started with a simple question - what is faster, xrange or range? - and we saw that the answer is not so obvious.
We continued by comparing the speed of Python 2 and Python 3, and we were surprised to see that Python 3 is a bit slower in that area.
Then we looked at ranges of high numbers and saw that they are much slower than "lower" ranges. We had to explore the Python source code to find the reason for that.
That's all for this post, I hope you enjoyed it.