Loop Better

a deeper look at iteration in Python

Trey Hunner / @treyhunner

On-site training for Python/Django
Weekly Python Chat live webcast host
Python Morsels
San Diego Python meetup co-organizer
Django Girls San Diego co-organizer
Python Software Foundation director

Looping Problems

Looping Twice


>>> numbers = [1, 2, 3, 5, 7]
>>> squares = (n**2 for n in numbers)
>>> tuple(squares)
(1, 4, 9, 25, 49)
>>> sum(squares)
0
          

Containment Checking


>>> numbers = [1, 2, 3, 5, 7]
>>> squares = (n**2 for n in numbers)
>>> 9 in squares
True
>>> 9 in squares
False
          

Unpacking


>>> counts = {'apples': 2, 'oranges': 1}
>>> x, y = counts
>>> x
'apples'
          

Review Time

C-style for loop


let numbers = [1, 2, 3, 5, 7];
for (let i = 0; i < numbers.length; i += 1) {
    console.log(numbers[i]);
}
          

1
2
3
5
7
          

Python for loop


numbers = [1, 2, 3, 5, 7]
for n in numbers:
    print(n)
          


1
2
3
5
7
          

Iterables are iter-able


for item in some_iterable:
    print(item)
          

Sequences


>>> numbers = [1, 2, 3, 5, 7]
>>> coordinates = (4, 5, 7)
>>> words = "hello there"
>>> numbers[0]
1
>>> coordinates[2]
7
>>> words[4]
'o'
          

Other iterables


>>> my_set = {1, 2, 3}
>>> my_dict = {'k1': 'v1', 'k2': 'v2'}
>>> my_file = open('some_file.txt')
>>> squares = (n**2 for n in my_set)
>>> from itertools import count
>>> c = count()
          

What we know

  • Python doesn't have traditional C-style for loops
  • Python does have a flavor of foreach loops which we call for loops
  • Anything that can be looped over in Python is an iterable
  • Sequences are a special type of iterable
  • Not all iterables are sequences
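
The split between sequences and other iterables can be checked directly. This is a minimal sketch that assumes the abstract base classes in collections.abc are an acceptable stand-in for "is it an iterable?" and "is it a sequence?"; the `examples` and `kinds` names are illustrative and not from the slides.

```python
from collections.abc import Iterable, Sequence

# A few objects from the earlier slides (names here are illustrative).
examples = {
    "list": [1, 2, 3, 5, 7],
    "tuple": (4, 5, 7),
    "str": "hello there",
    "set": {1, 2, 3},
    "dict": {"k1": "v1", "k2": "v2"},
    "generator": (n**2 for n in [1, 2, 3]),
}

# Map each name to (is it an iterable?, is it a sequence?).
kinds = {
    name: (isinstance(obj, Iterable), isinstance(obj, Sequence))
    for name, obj in examples.items()
}

for name, (is_iterable, is_sequence) in kinds.items():
    print(f"{name:10} iterable={is_iterable} sequence={is_sequence}")
```

Everything in the table is an iterable, but only the list, tuple, and string report as sequences.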

How do for loops work?

Can we loop with indexes?


numbers = [1, 2, 3, 5, 7]
i = 0
while i < len(numbers):
    print(numbers[i])
    i += 1
          

1
2
3
5
7
          

We cannot loop with indexes


fruits = {'lemon', 'apple', 'orange', 'watermelon'}
i = 0
while i < len(fruits):
    print(fruits[i])
    i += 1
          

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: 'set' object is not subscriptable
          

Iterators power for loops

Iterables can give you iterators


>>> numbers = [1, 2, 3, 5, 7]
>>> coordinates = (4, 5, 7)
>>> words = "hello there"
>>> iter(numbers)
<list_iterator object at 0x7f2b9271c860>
>>> iter(coordinates)
<tuple_iterator object at 0x7f2b9271ce80>
>>> iter(words)
<str_iterator object at 0x7f2b9271c860>
          

Iterators can give the next item


>>> numbers = [1, 2, 3]
>>> iterator = iter(numbers)
>>> next(iterator)
1
>>> next(iterator)
2
>>> next(iterator)
3
>>> next(iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
          

A For Loop


def funky_for_loop(iterable, action_to_do):
    for item in iterable:
        action_to_do(item)
          

Looping without a for loop


def funky_for_loop(iterable, action_to_do):
    iterator = iter(iterable)
    done_looping = False
    while not done_looping:
        try:
            item = next(iterator)
        except StopIteration:
            done_looping = True
        else:
            action_to_do(item)
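
A quick check that the hand-written loop works where index-based looping failed. This sketch repeats the function so it runs on its own; the `collected` list is an illustrative stand-in for whatever action you want to perform.

```python
def funky_for_loop(iterable, action_to_do):
    # Same manual loop as on the slide: iter(), next(), StopIteration.
    iterator = iter(iterable)
    done_looping = False
    while not done_looping:
        try:
            item = next(iterator)
        except StopIteration:
            done_looping = True
        else:
            action_to_do(item)

# Sets cannot be indexed, but they can hand out an iterator.
fruits = {'lemon', 'apple', 'orange', 'watermelon'}
collected = []
funky_for_loop(fruits, collected.append)
print(sorted(collected))
```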
          

The Iterator Protocol


for n in numbers:
    print(n)
          

x, y, z = coordinates
          

a, b, *rest = numbers
print(*numbers)
          

unique_numbers = set(numbers)
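
To see that all of these rely on the same protocol, here is a sketch with a hypothetical `Countdown` class (not from the slides) that supports nothing except `iter()`. It has no indexes and no length, yet every form of iteration above still works on it.

```python
class Countdown:
    """Hypothetical iterable: it only knows how to return an iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Delegate to a range iterator counting down to 1.
        return iter(range(self.start, 0, -1))


# for loop
looped = []
for n in Countdown(3):
    looped.append(n)

# tuple unpacking
x, y, z = Countdown(3)

# star expressions
first, *rest = Countdown(4)

# constructors that accept any iterable
unique_numbers = set(Countdown(3))

print(looped, (x, y, z), first, rest, unique_numbers)
```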
          

You've met iterators before

Generators are iterators


>>> numbers = [1, 2, 3]
>>> squares = (n**2 for n in numbers)
>>> next(squares)
1
>>> next(squares)
4
>>> squares = (n**2 for n in numbers)
>>> for n in squares:
...     print(n)
...
1
4
9
          

I lied

Iterators are iterables


>>> numbers = [1, 2, 3]
>>> iterator = iter(numbers)
>>> iterator2 = iter(iterator)
>>> iterator2
<list_iterator object at 0x7f92db9bf350>
>>> iterator is iterator2
True
          

Iterators are their own iterators


def is_iterator(iterable):
    return iter(iterable) is iterable
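
A small usage sketch for the check above. The function is repeated so the example runs on its own, and the `results` dictionary is just an illustrative way to collect the answers.

```python
def is_iterator(iterable):
    # An iterator returns itself when passed to iter().
    return iter(iterable) is iterable


numbers = [1, 2, 3]
results = {
    "list": is_iterator(numbers),
    "list iterator": is_iterator(iter(numbers)),
    "generator": is_iterator(n**2 for n in numbers),
    "enumerate": is_iterator(enumerate(numbers)),
    "range": is_iterator(range(3)),
}

for name, answer in results.items():
    print(f"{name}: {answer}")
```

Lists and range objects give you a fresh iterator each time, so they are iterables but not iterators.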
          

Iterators are single-purposed


>>> numbers = [1, 2, 3, 5, 7]
>>> iterator = iter(numbers)
>>> len(iterator)
TypeError: object of type 'list_iterator' has no len()
>>> iterator[0]
TypeError: 'list_iterator' object is not subscriptable
>>> next(iterator)
1
>>> list(iterator)
[2, 3, 5, 7]
>>> list(iterator)
[]

          

The truth

Object      Iterable?   Iterator?
Iterable    ✔️          ❓
Iterator    ✔️          ✔️
Generator   ✔️          ✔️
List        ✔️          ❌

So...why do we care?

Iterators enable laziness

  1. Iterators allow for lazily evaluated iterables
  2. Iterators allow for infinitely long iterables
  3. Iterators allow us to save memory and (sometimes) time
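
The three points above can be sketched with an infinite iterable. This example assumes `itertools.count` and `itertools.islice` from the standard library; the variable names are illustrative.

```python
from itertools import count, islice

# An infinitely long iterable: no squares are computed at this point.
squares = (n**2 for n in count(1))

# Only the five items we ask for are ever computed and held in memory.
first_five = list(islice(squares, 5))

# The generator picks up where it left off.
next_square = next(squares)

print(first_five, next_square)
```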

Iterators are everywhere


>>> letters = ['a', 'b']
>>> next(enumerate(letters))
(0, 'a')
>>> next(zip(letters, letters))
('a', 'a')
>>> next(open('hello.txt'))
'hello world\n'
          

Creating your own iterator


class square_all:
    def __init__(self, numbers):
        self.numbers = iter(numbers)
    def __next__(self):
        return next(self.numbers) ** 2
    def __iter__(self):
        return self
          

def square_all(numbers):
    for n in numbers:
        yield n**2
          

def square_all(numbers):
    return (n**2 for n in numbers)
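
The three versions above should behave the same way. The sketch below renames them (`SquareAll`, `square_all_generator`, `square_all_expression` are illustrative names) so they can sit side by side, and shows that the result is a single-use iterator in each case.

```python
class SquareAll:
    """Iterator class: implements __next__ and returns itself from __iter__."""

    def __init__(self, numbers):
        self.numbers = iter(numbers)

    def __next__(self):
        # StopIteration from the inner iterator ends this iterator too.
        return next(self.numbers) ** 2

    def __iter__(self):
        return self


def square_all_generator(numbers):
    for n in numbers:
        yield n**2


def square_all_expression(numbers):
    return (n**2 for n in numbers)


numbers = [1, 2, 3]
makers = (SquareAll, square_all_generator, square_all_expression)
results = [list(make(numbers)) for make in makers]

# Each one is an iterator, so a second pass finds nothing left.
squares = SquareAll(numbers)
first_pass = list(squares)
second_pass = list(squares)

print(results, first_pass, second_pass)
```

The generator versions are shorter because Python writes the `__next__` and `__iter__` plumbing for you.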
          

think lazy


hours_worked = 0
for event in events:
    if event.is_billable():
        hours_worked += event.duration
          

billable_times = (
    event.duration
    for event in events
    if event.is_billable()
)

hours_worked = sum(billable_times)
          

Think lazy


for i, line in enumerate(log_file):
    if i >= 10:
        break
    print(line)
          

from itertools import islice

first_ten_lines = islice(log_file, 10)
for line in first_ten_lines:
    print(line)
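
A runnable sketch of the `islice` version. It assumes an in-memory `io.StringIO` object as a stand-in for a real log file, since the slides do not say what `log_file` contains.

```python
import io
from itertools import islice

# Stand-in for an open log file with 25 lines (illustrative data).
log_file = io.StringIO("".join(f"line {n}\n" for n in range(1, 26)))

# islice pulls only the first ten lines from the file iterator.
first_ten_lines = list(islice(log_file, 10))

# The file object is an iterator, so it resumes from line 11.
next_line = next(log_file)

print(first_ten_lines[0], first_ten_lines[-1], next_line)
```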
          

Think Lazy


differences = []
previous = readings[0]
for current in readings[1:]:
    differences.append(current - previous)
    previous = current
          

from my_fancy_utils_module import with_previous

differences = []
for previous, current in with_previous(readings):
    differences.append(current - previous)
          

THINK LAZY


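def with_previous(iterable):
    """Yield (previous, current) tuples, starting with second."""
    iterator = iter(iterable)
    try:
        previous = next(iterator)
    except StopIteration:
        # Empty input: a bare StopIteration inside a generator
        # becomes RuntimeError on Python 3.7+, so stop cleanly.
        return
    for item in iterator:
        yield previous, item
        previous = item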
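
A usage sketch for the helper. The function is repeated here, with a guard for empty input, so the example runs on its own; the `readings` values are made up for illustration.

```python
def with_previous(iterable):
    """Yield (previous, current) tuples, starting with second."""
    iterator = iter(iterable)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # nothing to pair up in an empty iterable
    for item in iterator:
        yield previous, item
        previous = item


readings = [3, 7, 12, 10]  # illustrative data
differences = [
    current - previous
    for previous, current in with_previous(readings)
]

# Unlike readings[0] and readings[1:], this also works on a generator.
pairs = list(with_previous(n * 2 for n in readings))

print(differences, pairs)
```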
          

Looping Problems

Revisited

Exhausted

Looping Twice


>>> numbers = [1, 2, 3, 5, 7]
>>> squares = (n**2 for n in numbers)
>>> tuple(squares)
(1, 4, 9, 25, 49)
>>> sum(squares)
0
>>> tuple(squares)
()
          

Partially-Consumed

Containment Checking


>>> numbers = [1, 2, 3, 5, 7]
>>> squares = (n**2 for n in numbers)
>>> 9 in squares
True
>>> 9 in squares
False
>>> squares = (n**2 for n in numbers)
>>> 9 in squares
True
>>> list(squares)
[25, 49]
          

Unpacking


>>> counts = {'apples': 2, 'oranges': 1}
>>> for key in counts:
...     print(key)
...
apples
oranges
>>> x, y = counts
>>> x, y
('apples', 'oranges')
          

Remember

  • Iterators are the most rudimentary form of iterables
  • Not all iterables are sequences
  • Assume your iterable can be iterated over and that's it
  • When you need a new lazy iterable, make a generator
  • Every form of iteration relies on the iterator protocol
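
One way to apply "assume your iterable can be iterated over and that's it": if a function needs two passes, make your own list first. This is a sketch with a hypothetical `fractions_of_total` function, not something from the talk.

```python
def fractions_of_total(values):
    """Return each value as a fraction of the total (hypothetical helper)."""
    # The caller may pass a generator, which can only be looped over once,
    # so materialize it before making two passes.
    values = list(values)
    total = sum(values)                  # first pass
    return [v / total for v in values]   # second pass


numbers = [1, 2, 3, 5, 7]
squares = (n**2 for n in numbers)
shares = fractions_of_total(squares)

print(shares)
```

Without the `list(values)` line, `sum` would exhaust the generator and the second pass would produce an empty list.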

Things to Google

Trey Hunner / @treyhunner
Python & Django Team Trainer
http://truthful.technology

Tally counter image copyright Linda Spashett CC BY
Hello Kitty PEZ dispenser image copyright Deborah Austin CC BY