Effective Python: zip, enumerate, and iter

This short post looks at three built-in Python functions: zip, enumerate, and iter.

Let’s make an iterable, specifically a list, to use in our examples.

lst = [chr(65+i) for i in range(26)]
# A, B, ..., Z

In older programming languages we would iterate over lst with an index loop, accessing each element by position.

for i in range(len(lst)):
    print(i, lst[i])

A modern, Pythonic approach iterates directly over the list.

i = 0
for x in lst:
    print(i, x)
    i += 1

Keeping track of the index manually is ugly. Is there a better way? One straw-man solution takes pairs from a range and lst. The zip function does just that: it produces an iterator yielding pairs (or, more generally, tuples if you pass more arguments) until the shortest argument is exhausted. We can write:

for i, x in zip(range(len(lst)), lst):
    print(i, x)

To see what zip does, evaluate it into a list:

list(zip(range(4), lst[:4]))
# [(0, 'A'), (1, 'B'), (2, 'C'), (3, 'D')]
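The "shortest argument" rule and the behaviour with more than two arguments are easy to check directly. The snippet below is a small sketch; the names letters, numbers, and flags are made up for illustration and are not part of the examples above.

```python
letters = ["A", "B", "C", "D"]
numbers = range(10)            # deliberately longer than letters
flags = [True, False, True]    # shortest of the three

# Two arguments: zip stops after four pairs, when letters runs out.
pairs = list(zip(numbers, letters))
print(pairs)
# [(0, 'A'), (1, 'B'), (2, 'C'), (3, 'D')]

# Three arguments: zip yields 3-tuples and stops after three, when flags runs out.
triples = list(zip(numbers, letters, flags))
print(triples)
# [(0, 'A', True), (1, 'B', False), (2, 'C', True)]
```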

Pairing each element with its index is so common that Python provides the enumerate function to do the same thing.

for i, x in enumerate(lst):
    print(i, x)

Again, to see what’s going on:

list(enumerate(lst[:4]))
# [(0, 'A'), (1, 'B'), (2, 'C'), (3, 'D')]
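One detail the zip version does not give you for free: enumerate accepts an optional start argument, which is handy when the numbering should begin at 1. A minimal sketch, with numbered as an illustrative name:

```python
lst = [chr(65 + i) for i in range(26)]

# Number the first four letters from 1 instead of the default 0.
numbered = list(enumerate(lst[:4], start=1))
print(numbered)
# [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D')]
```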

A handy use of zip is to create a dictionary from two iterables: one for keys and the other for values. For instance:

d = dict(zip(lst[::-1], lst))
# {'Z': 'A', 'Y': 'B', 'X': 'C', ... }
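Two edge cases are worth knowing when building a dictionary this way. The sketch below uses made-up inputs (counts and partial are illustrative names) to show what happens with repeated keys and with inputs of unequal length.

```python
# Repeated keys: the later pair overwrites the earlier one.
counts = dict(zip("AAB", [1, 2, 3]))
print(counts)
# {'A': 2, 'B': 3}

# Unequal lengths: zip stops at the shorter input, so 'C' gets no entry.
partial = dict(zip("ABC", [10, 20]))
print(partial)
# {'A': 10, 'B': 20}
```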

We can iterate over key/value pairs using the dict.items method.

for k, v in d.items():
    print(k, v)

What if we want the key/value pairs enumerated? A first guess might be:

for i, k, v in enumerate(d.items()):
    print(i, k, v)
# ValueError: not enough values to unpack (expected 3, got 2)

Again, forcing the result to a list shows what is going on:

list(enumerate(d.items()))
# [(0, ('Z', 'A')), (1, ('Y', 'B')), (2, ('X', 'C')), ...]

The enumeration produces pairs of values: an int and a (key, value) tuple. The correct solution unpacks the inner tuple with parentheses:

for i, (k, v) in enumerate(d.items()):
    print(i, k, v)
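The same nested unpacking applies whenever enumerate wraps something that already yields tuples, such as zip. A short sketch, using small made-up lists (keys, values, rows) rather than the dictionary above:

```python
keys = ["Z", "Y", "X"]
values = ["A", "B", "C"]

rows = []
# enumerate yields (index, (key, value)); the parentheses unpack the inner pair.
for i, (k, v) in enumerate(zip(keys, values)):
    rows.append(f"{i}: {k} -> {v}")

print(rows)
# ['0: Z -> A', '1: Y -> B', '2: X -> C']
```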

The iter function creates an iterator from an iterable. An iterator is an object on which next can be called. Iterators have many uses (this post shows how to use them to create graphics, for one). One clever application is to serve up the elements of an iterable n at a time. For example, to get letters 5 at a time from lst we can use this pattern:

for v, w, x, y, z in zip(*[iter(lst)]*5):
    print(v, w, x, y, z)

How does this work? Let’s expand the code. It is equivalent to

it = iter(lst)
for v, w, x, y, z in zip(it, it, it, it, it):
    print(v, w, x, y, z)

The list [iter(lst)]*5 holds five references to the same iterator, not five independent copies. The leading * in zip(*[iter(lst)]*5) unpacks that list into separate arguments for zip. As zip builds each tuple, it calls next on that one iterator five times, so each tuple consumes the next five letters. We might quibble that this pattern misses Z: 26 is not a multiple of 5, and zip stops when the shortest argument is exhausted.
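To make the sharing explicit, and to wrap the idiom for any group size, here is a minimal sketch. The helper name chunks is hypothetical, not a built-in; it relies on zip evaluating its arguments left to right, which the Python documentation guarantees.

```python
# [it] * 2 repeats a reference to one iterator; it does not copy it.
it = iter("ABC")
refs = [it] * 2
first = next(refs[0])    # advances the shared iterator
second = next(refs[1])   # continues where the previous call left off
print(first, second)
# A B


def chunks(iterable, n):
    """Hypothetical helper: yield n-tuples, dropping any incomplete tail."""
    it = iter(iterable)       # one iterator...
    return zip(*[it] * n)     # ...handed to zip n times


lst = [chr(65 + i) for i in range(26)]
groups = list(chunks(lst, 5))
print(len(groups))
# 5
print(groups[-1])
# ('U', 'V', 'W', 'X', 'Y')
```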

The itertools library contains a version of zip, zip_longest, that continues until the longest argument is exhausted and uses fillvalue to pad out the shorter ones.

from itertools import zip_longest
for i in zip_longest(*[iter(lst)]*5, fillvalue='-'):
    print(i)
# ('A', 'B', 'C', 'D', 'E')
# ('F', 'G', 'H', 'I', 'J')
# ('K', 'L', 'M', 'N', 'O')
# ('P', 'Q', 'R', 'S', 'T')
# ('U', 'V', 'W', 'X', 'Y')
# ('Z', '-', '-', '-', '-')
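The padded version can be wrapped in a small function as well. The sketch below assumes a helper named grouper (an illustrative name, modelled on the recipe of that name in the itertools documentation) and shows that fillvalue defaults to None when omitted.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Illustrative helper: yield n-tuples, padding the last one with fillvalue."""
    it = iter(iterable)
    return zip_longest(*[it] * n, fillvalue=fillvalue)


padded = list(grouper("ABCDEFG", 3, fillvalue="-"))
print(padded)
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', '-', '-')]

# Without an explicit fillvalue, missing slots are filled with None.
default = list(grouper("ABCDEFG", 3))
print(default[-1])
# ('G', None, None)
```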

I find zip, enumerate, and iter to be very handy, and use them frequently.

Happy coding!

posted 2022-02-15 | tags: Effective Python, Python, iterables, enumerate, iter, zip
