After my Loop Better talk at PyGotham 2017 someone asked me a great question: iterators are lazy iterables and range
is a lazy iterable in Python 3, so is range
an iterator?
Unfortunately, I don’t remember the name of the person who asked me this question. I do remember saying something along the lines of “oh I love that question!”
I love this question because range
objects in Python 3 (xrange in Python 2) are lazy, but range objects are not iterators and this is something I see folks mix up frequently.
In the last year I’ve heard Python beginners, long-time Python programmers, and even other Python educators mistakenly refer to Python 3’s range
objects as iterators. This distinction is something a lot of people get confused about.
Yes this is confusing
When people talk about iterators and iterables in Python, you’re likely to hear the someone repeat the misconception that range
is an iterator. This mistake might seem unimportant at first, but I think it’s actually a pretty critical one. If you believe that range
objects are iterators, your mental model of how iterators work in Python isn’t clear enough yet. Both range
and iterators are “lazy” in a sense, but they’re lazy in fairly different ways.
With this article I’m going to explain how iterators work, how range
works, and how the laziness of these two types of “lazy iterables” differs.
But first, I’d like to ask that you do not use the information below as an excuse to be unkind to anyone, whether new learners or experienced Python programmers. Many people have used Python very happily for years without fully understanding the distinction I’m about to explain. You can write many thousands of lines of Python code without having a strong mental model of how iterators work.
What’s an iterator?
In Python an iterable is anything that you can iterate over and an iterator is the thing that does the actual iterating.
Iter-ables are able to be iterated over. Iter-ators are the agents that perform the iteration.
You can get an iterator from any iterable in Python by using the iter
function:
1 2 3 4 |
|
Once you have an iterator, the only thing you can do with it is get its next item:
1 2 3 4 5 |
|
And you’ll get a stop iteration exception if you ask for the next item but there aren’t anymore items:
1 2 3 4 |
|
Both conveniently and somewhat confusingly, all iterators are also iterables. Meaning you can get an iterator from an iterator (it’ll give you itself back). Therefore you can iterate over an iterator as well:
1 2 3 |
|
Importantly, it should be noted that iterators are stateful. Meaning once you’ve consumed an item from an iterator, it’s gone. So after you’ve looped over an iterator once, it’ll be empty if you try to loop over it again:
1 2 3 4 5 |
|
In Python 3, enumerate
, zip
, reversed
, and a number of other built-in functions return iterators:
1 2 3 4 5 6 |
|
Generators (whether from generator functions or generator expressions) are one of the simpler ways to create your own iterators:
1 2 3 4 |
|
I often say that iterators are lazy single-use iterables. They’re “lazy” because they have the ability to only compute items as you loop over them. And they’re “single-use” because once you’ve “consumed” an item from an iterator, it’s gone forever. The term “exhausted” is often used for an iterator that has been fully consumed.
That was the quick summary of what iterators are. If you haven’t encountered iterators before, I’d recommend reviewing them a bit further before continuing on. I’ve written an article which explains iterators and I’ve given a talk, Loop Better which I mentioned earlier, during which I dive a bit deeper into iterators.
How is range different?
Okay we’ve reviewed iterators. Let’s talk about range
now.
The range
object in Python 3 (xrange
in Python 2) can be looped over like any other iterable:
1 2 3 4 5 6 |
|
And because range
is an iterable, we can get an iterator from it:
1 2 |
|
But range
objects themselves are not iterators. We cannot call next
on a range
object:
1 2 3 4 |
|
And unlike an iterator, we can loop over a range
object without consuming it:
1 2 3 4 5 |
|
If we did this with an iterator, we’d get no elements the second time we looped:
1 2 3 4 5 |
|
Unlike zip
, enumerate
, or generator
objects, range
objects are not iterators.
So what is range?
The range
object is “lazy” in a sense because it doesn’t generate every number that it “contains” when we create it. Instead it gives those numbers to us as we need them when looping over it.
Here is a range
object and a generator (which is a type of iterator):
1 2 |
|
Unlike iterators, range
objects have a length:
1 2 3 4 5 6 |
|
And they can be indexed:
1 2 3 4 5 6 |
|
And unlike iterators, you can ask them whether they contain things without changing their state:
1 2 3 4 5 6 7 8 |
|
If you’re looking for a description for range
objects, you could call them “lazy sequences”. They’re sequences (like lists, tuples, and strings) but they don’t really contain any memory under the hood and instead answer questions computationally.
1 2 3 4 5 6 7 |
|
Why does this distinction matter?
It might seem like I’m nitpicking in saying that range isn’t an iterator, but I really don’t think I am.
If I tell you something is an iterator, you’ll know that when you call iter
on it you’ll always get the same object back (by definition):
1 2 |
|
And you’ll be certain that you can call next
on it because you can call next
on all iterators:
1 2 3 4 5 6 |
|
And you’ll know that items will be consumed from the iterator as you loop over it. Sometimes this feature can come in handy for processing iterators in particular ways:
1 2 3 |
|
So while it may seem like the difference between “lazy iterable” and “iterator” is subtle, these terms really do mean different things. While “lazy iterable” is a very general term without concrete meaning, the word “iterator” implies an object with a very specific set of behaviors.
When in doubt say “iterable” or “lazy iterable”
If you know you can loop over something, it’s an iterable.
If you know the thing you’re looping over happens to compute things as you loop over it, it’s a lazy iterable.
If you know you can pass something to the next
function, it’s an iterator (which are the most common form of lazy iterables).
If you can loop over something multiple times without “exhausting” it, it’s not an iterator. If you can’t pass something to the next
function, it’s not an iterator. Python 3’s range
object is not an iterator. If you’re teaching people about range
objects, please don’t use the word “iterator”. It’s confusing and might cause others to start misusing the word “iterator” as well.
On the other hand, if you see someone else misusing the word iterator don’t be mean. You may want to point out the misuse if it seems important, but keep in mind that I’ve heard long-time Python programmers and experienced Python educators misuse this word by calling range
objects iterators. Words are important, but language is tricky.
Thanks for joining me on this brief range
and iterator-filled adventure!