Python List Comprehensions by Example

List comprehensions are a beautiful way to simplify code. According to the Python documentation, "list comprehensions provide a concise way to create lists." In this guide, I'll walk through a few examples of how you can use list comprehensions to be more expressive and simplify your code.

1

If you wanted to create a list of squares for the numbers from 0 to 9 you could do the following:

squares = []
for x in range(10):
    squares.append(x**2)

This is an easy example, but there is a much more concise way to write this using list comprehensions.

squares = [x**2 for x in range(10)]

The basic list comprehension is composed of square brackets surrounding an expression followed by a for clause. List comprehensions always return a list.
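As a small illustration of that structure, here is a sketch that applies the same pattern to a list of strings. The words list and the variable names are made up for this example.

```python
# Hypothetical sample data, not taken from the examples above.
words = ['spam', 'eggs', 'shrubbery']

# Pattern: [expression for item in iterable]
lengths = [len(word) for word in words]
shouted = [word.upper() for word in words]

print(lengths)  # [4, 4, 9]
print(shouted)  # ['SPAM', 'EGGS', 'SHRUBBERY']
```

The expression on the left can be any valid Python expression, including a function or method call.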

2

To create a list of the numbers from 0 to 99 that are divisible by three, you might normally try:

numbers = []
for x in range(100):
    if x % 3 == 0:
        numbers.append(x)

You can include an if clause in the list comprehension to conditionally include items. Here is the same list built with a list comprehension:

numbers = [x for x in range(100) if x % 3 == 0]

3

Suppose you want to find the prime numbers below 50. It would take quite a few lines of code to accomplish this normally.

noprimes = []
for i in range(2, 8):
    for j in range(i*2, 50, i):
        noprimes.append(j)
primes = []
for x in range(2, 50):
    if x not in noprimes:
        primes.append(x)

However, you can simplify this by using two list comprehensions.

noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)]
primes = [x for x in range(2, 50) if x not in noprimes]

The first line uses multiple for loops within one list comprehension. The first for loop is the outer loop, and the second for loop is the inner loop. To find primes, we first build a list of non-prime numbers. This list is generated by finding multiples of the numbers 2-7. Then we loop through a range of numbers and check whether each number is in the list of non-primes.
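To make the ordering of the for clauses concrete, here is a small sketch using a shorter range than the example above. The (i, j) tuples and the name multiples are only there for illustration.

```python
# Comprehension form: the for clauses read left to right, outermost loop first.
# The inner clause may refer to the variable bound by the outer one.
multiples = [(i, j) for i in range(2, 4) for j in range(i * 2, 10, i)]

# The same thing written as explicit nested loops.
expected = []
for i in range(2, 4):
    for j in range(i * 2, 10, i):
        expected.append((i, j))

print(multiples)              # [(2, 4), (2, 6), (2, 8), (3, 6), (3, 9)]
print(multiples == expected)  # True
```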

Edit: as pointed out by shoyer on reddit, using a set for noprimes is much more efficient. Since noprimes should contain only unique values and we frequently have to check for the existence of a value, we should be using a set. Passing a generator expression to set() uses the same syntax as a list comprehension without the square brackets (a set comprehension written with curly braces would work as well), so we can use the following:

noprimes = set(j for i in range(2, 8) for j in range(i*2, 50, i))
primes = [x for x in range(2, 50) if x not in noprimes]

4

Suppose you had a list of lists or a matrix,

matrix = [[0,1,2,3], [4,5,6,7], [8,9,10,11]]

and you want to flatten it into a single list. You could do so like this:

flattened = []
for row in matrix:
    for i in row:
        flattened.append(i)

And using a list comprehension:

flattened = [i for row in matrix for i in row]

This uses two for loops to iterate through the entire matrix. The outer (first) for loop iterates through the rows, and the inner (second) for loop iterates through each item i in the row.
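For comparison, the standard library offers itertools.chain.from_iterable, which flattens one level of nesting. The sketch below checks that it gives the same result as the comprehension; the name chained is only for this example.

```python
from itertools import chain

matrix = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# List comprehension with two for clauses (outer loop first).
flattened = [i for row in matrix for i in row]

# Alternative: chain the rows together and collect them into a list.
chained = list(chain.from_iterable(matrix))

print(flattened)             # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(flattened == chained)  # True
```

Both approaches flatten a single level only; a list nested more deeply would need another for clause or a different technique.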

5

Suppose you want to simulate a series of coin tosses where 0 is heads and 1 is tails. You could do this:

from random import random
results = []
for x in range(10):
    results.append(int(round(random())))

Or use a list comprehension to make it more concise:

from random import random
results = [int(round(random())) for x in range(10)]

This uses range to loop 10 times. Each time we round the output of random(). Since random() returns a float between 0 and 1, rounding the output gives either 0 or 1. In Python 2, round() returns a float, so we convert it to an integer using int() and add that value to the list. In Python 3, round() with a single argument already returns an integer, so the int() call is harmless but no longer required.
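A possible variation, shown here only as a sketch, is to let random.randint(0, 1) produce the integer directly. The seed value and the names heads and tails are arbitrary choices for this example.

```python
import random

# Arbitrary seed, used only so that repeated runs produce the same tosses.
random.seed(42)

# The underscore signals that the loop variable itself is not used.
results = [random.randint(0, 1) for _ in range(10)]

# Following the convention above: 0 is heads, 1 is tails.
heads = results.count(0)
tails = results.count(1)
print(results, heads, tails)
```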

6

Suppose you have a sentence,

sentence = 'Your mother was a hamster'

and you want to remove all of the vowels. We can do this easily in a few lines:

vowels = 'aeiou'
non_list = []
for l in sentence:
    if l not in vowels:
        non_list.append(l)
nonvowels = ''.join(non_list)

Or you can simplify this using a list comprehension:

vowels = 'aeiou'
nonvowels = ''.join([l for l in sentence if l not in vowels])

This example uses a list comprehension to create a list of letters from sentence that are not vowels. Then we pass the resulting list to join() to convert it to a string.

Edit: As noted by iamadogwhatisthis on reddit, this example doesn't require a list comprehension. A generator expression is more appropriate:

vowels = 'aeiou'
nonvowels = ''.join(l for l in sentence if l not in vowels)

Notice the missing square brackets. This works because join() accepts any iterable, including lists and generators. The syntax without square brackets is a generator expression. It produces the same result, but rather than packing all of the items into a list first, it yields them one at a time as they are consumed. In general this avoids storing the entire list in memory, which is more efficient for larger data. (CPython's join() still collects the items internally before building the string, so the saving is most noticeable with consumers such as sum() or any().)
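To give a rough sense of the memory difference, the sketch below compares the size of a list with the size of the equivalent generator object using sys.getsizeof. The range of 10,000 items and all variable names are assumptions chosen for illustration, and the exact byte counts vary between Python versions.

```python
import sys

# The list holds references to all 10,000 results at once.
squares_list = [x * x for x in range(10000)]

# The generator object only holds the state needed to produce the next item.
squares_gen = (x * x for x in range(10000))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)  # True

# When a generator expression is the only argument, no extra parentheses are needed.
total = sum(x * x for x in range(10000))
print(total == sum(squares_list))  # True
```

Note that getsizeof reports only the size of the container object itself, not of the integers it refers to.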

7

The following code will iterate through the files in the my_dir directory and append each one that has a .txt extension to a list.

import os
files = []
for f in os.listdir('./my_dir'):
    if f.endswith('.txt'):
        files.append(f)

This can be simplified with a list comprehension as well:

import os
files = [f for f in os.listdir('./my_dir') if f.endswith('.txt')]

Or you can get a list of the relative paths:

import os
files = [os.path.join('./my_dir', f) for f in os.listdir('./my_dir') if f.endswith('.txt')]

Courtesy of rasbt on reddit.
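To try this without an existing my_dir, here is a self-contained sketch that creates a temporary directory with a few empty files first. The file names are invented for the example, and sorted() is used because os.listdir() returns entries in arbitrary order.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as my_dir:
    # Create three empty files with made-up names.
    for name in ('notes.txt', 'data.csv', 'todo.txt'):
        open(os.path.join(my_dir, name), 'w').close()

    # Keep only the .txt files, sorted for a predictable order.
    files = sorted(f for f in os.listdir(my_dir) if f.endswith('.txt'))

print(files)  # ['notes.txt', 'todo.txt']
```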

8

It's a frequent requirement to read in data from a csv file and process it. One of the most useful ways to process csv data is to turn it into a list of dictionaries.

import csv
data = []
with open('file.csv', 'r', newline='') as f:
    for row in csv.DictReader(f):
        data.append(row)

You can quickly do this with a list comprehension:

import csv
with open('file.csv', 'r', newline='') as f:
    data = [row for row in csv.DictReader(f)]

The DictReader class automatically uses the first row of the csv file as the dictionary key names. It returns an object that iterates over the rows of the csv file. The file object is created by the open() function. We give open() the name of the csv file first and the mode second, plus the keyword argument newline=''. As usual, 'r' means to open the file in read mode. The csv module documentation recommends newline='' so that the reader can handle the line endings '\n', '\r', and '\r\n' itself. Older Python 2 code did this with the 'rU' (universal newlines) mode, which was removed in Python 3.11. The with statement closes the file once the list has been built.

Courtesy of blacwidonsfw on reddit.
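Once the rows are dictionaries, further comprehensions can pull out individual columns. The sketch below uses io.StringIO in place of a real file so it can run on its own; the column names and values are invented sample data.

```python
import csv
import io

# Invented sample data standing in for the contents of a csv file.
csv_text = "name,age\nAlice,30\nBob,25\n"

data = [row for row in csv.DictReader(io.StringIO(csv_text))]

# Each row maps the header names to string values.
names = [row['name'] for row in data]
ages = [int(row['age']) for row in data]

print(names)  # ['Alice', 'Bob']
print(ages)   # [30, 25]
```

All values come back as strings, which is why the ages are converted with int().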