How to Use the Python Zip and Unzip Function

John John (304)
5 minutes

Python's zip function lets us pair up elements that share the same index across multiple iterables. As a contrived example, imagine you have a list of first names and a list of last names whose indexes correspond.

first_names = ['George', 'Benjamin', 'Thomas']
last_names = ['Washington', 'Franklin', 'Jefferson']

Zip allows us to combine these two lists, mapping the elements by index.

zipped = zip(first_names, last_names)
print(list(zipped))

>> [('George', 'Washington'), ('Benjamin', 'Franklin'), ('Thomas', 'Jefferson')]
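In practice, the pairs are often consumed directly in a for loop rather than converted to a list first. Here is a minimal sketch reusing the lists above; the full_names variable is only there for illustration.

```python
first_names = ['George', 'Benjamin', 'Thomas']
last_names = ['Washington', 'Franklin', 'Jefferson']

full_names = []
# Each iteration unpacks one (first, last) tuple produced by zip
for first, last in zip(first_names, last_names):
    full_names.append(f'{first} {last}')

print(full_names)
# ['George Washington', 'Benjamin Franklin', 'Thomas Jefferson']
```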

In this guide, we'll learn how the zip function works in Python 3 and see some examples demonstrating how to zip and unzip.


In Python 3, zip is a function that aggregates elements from multiple iterables and returns an iterator.

In the example from the introduction, I converted the zipped object to a list so we could see the output. But the zip function actually returns a zip object, which is an iterator: it produces each tuple only when asked for it, rather than building the whole result up front. This saves memory when we zip extremely large iterables.

first_names = ['George', 'Benjamin', 'Thomas']
last_names = ['Washington', 'Franklin', 'Jefferson']

zipped = zip(first_names, last_names)

next(zipped)
>> ('George', 'Washington')

next(zipped)
>> ('Benjamin', 'Franklin')

next(zipped)
>> ('Thomas', 'Jefferson')

Note: The next function is designed to retrieve the next element from an iterator.

You'll notice that the zip function returns an iterator where each element is a tuple containing the corresponding elements from each iterable.
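One consequence of zip returning an iterator is that it can only be consumed once. The sketch below illustrates this; first_pass and second_pass are illustrative names.

```python
first_names = ['George', 'Benjamin', 'Thomas']
last_names = ['Washington', 'Franklin', 'Jefferson']

zipped = zip(first_names, last_names)

first_pass = list(zipped)   # consumes every tuple from the iterator
second_pass = list(zipped)  # the iterator is now exhausted

print(len(first_pass))  # 3
print(second_pass)      # []
```

If you need the pairs more than once, store the list (as first_pass does here) or call zip again.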

Associating column names with query results

Imagine a database library that executes queries and only returns a list of tuples containing the values, which keeps the footprint small (the bigquery library does something like this). There's a little bit of hand waving here, but stick with me.

So we've got a list containing the table schema:

schema = ['id', 'first_name', 'last_name']

And the query results look like this:

query_results = [
    (1, 'Thomas', 'Sowell',),
    (2, 'Murray', 'Rothbard',),
    (3, 'Friedrich', 'Hayek',),
    (4, 'Adam', 'Smith',),
]

Depending on what we want to do with this data, we may want to turn this into a list of dictionaries, where the keys are the column names and the values are the corresponding query results.

Zip is our friend.

dict_results = [dict(zip(schema, row)) for row in query_results]

>> [{'id': 1, 'first_name': 'Thomas', 'last_name': 'Sowell'},
 {'id': 2, 'first_name': 'Murray', 'last_name': 'Rothbard'},
 {'id': 3, 'first_name': 'Friedrich', 'last_name': 'Hayek'},
 {'id': 4, 'first_name': 'Adam', 'last_name': 'Smith'}]

Combining query string lists

Imagine we've got a front-end application that makes a GET request and passes a few lists in the query. And in our case, the elements of each list correspond to one another.

https://my-site.com/steps?title=step%20one&title=step%20two&slug=step-one&slug=step-two

In our example request, there are two titles and two slugs in the query string. On the backend, we may want to associate them, and we can use zip to do this!

data = list(zip(request.GET.getlist('title'), request.GET.getlist('slug')))

>> [('step one', 'step-one'), ('step two', 'step-two')]
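The request object above comes from a web framework (request.GET.getlist is Django's API). To try the same idea without a framework, here is a rough equivalent using only the standard library. The URL is a sample and params is an illustrative name.

```python
from urllib.parse import urlparse, parse_qs

# Sample URL with repeated 'title' and 'slug' keys
url = 'https://my-site.com/steps?title=step%20one&title=step%20two&slug=step-one&slug=step-two'

# parse_qs decodes the query string into a dict mapping each key to a list of values
params = parse_qs(urlparse(url).query)

data = list(zip(params['title'], params['slug']))
print(data)
# [('step one', 'step-one'), ('step two', 'step-two')]
```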

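Ok, so now we've got a list of tuples, and we want to go the other way: pull the elements at each index back out into their own tuples. We can unzip by passing the list to zip with the * operator, which unpacks the list into separate arguments.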

Imagine we've got a list of tuples that represents query results. The first value is the month, the second is the total revenue:

results = [
    ('January', 35423.85,),
    ('February', 31445.75,),
    ('March', 38525.22,),
]

Suppose we want to get the total revenue for the first quarter. We can unzip these results, then sum the revenue.

months, revenue = zip(*results)

print(revenue)

>> (35423.85, 31445.75, 38525.22)

print(sum(revenue))

>> 105394.82

Beautiful.
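To see that zip(*...) really does undo a zip, here is a small round-trip sketch. The variable names are illustrative and the data is the same as above.

```python
months = ('January', 'February', 'March')
revenue = (35423.85, 31445.75, 38525.22)

# Zip the two tuples together, then unzip the result with the * operator
zipped = list(zip(months, revenue))
unzipped_months, unzipped_revenue = zip(*zipped)

print(unzipped_months == months)    # True
print(unzipped_revenue == revenue)  # True
```

Note that unpacking into two names assumes the list is non-empty. With an empty list, zip(*[]) yields nothing and the assignment raises a ValueError.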

If we pass multiple iterables of different lengths, the default behavior is to stop as soon as the shortest iterable is exhausted.

short_list = list(range(5))  # [0, 1, 2, 3, 4]
long_list = list(range(10,20))  # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

list(zip(short_list, long_list))

>> [(0, 10), (1, 11), (2, 12), (3, 13), (4, 14)]

Notice that elements 15-19 from long_list were ignored.

If you happen to care about the trailing elements, you can use itertools.zip_longest, which pads the shorter iterables with None.

import itertools

list(itertools.zip_longest(short_list, long_list))

>> [(0, 10), (1, 11), (2, 12), (3, 13), (4, 14), (None, 15), (None, 16), (None, 17), (None, 18), (None, 19)]
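If None is not a convenient placeholder, zip_longest takes a fillvalue argument. Here is a small sketch that pads with 0 so the pairs can still be added together; padded and sums are illustrative names.

```python
import itertools

short_list = list(range(5))
long_list = list(range(10, 20))

# Pad the shorter list with 0 instead of the default None
padded = list(itertools.zip_longest(short_list, long_list, fillvalue=0))
print(padded[-1])  # (0, 19)

# With a numeric fill value, every pair can be summed safely
sums = [a + b for a, b in padded]
print(sums)
# [10, 12, 14, 16, 18, 15, 16, 17, 18, 19]
```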

Hopefully by now you understand how to use the zip function in Python. If you've got any questions, let me know in the comments below!

Also, feel free to share interesting ways you've used the zip function. I'd be happy to add them to this guide.
