Python Regexes - findall, search, and match

Jeremy
Total time: 5 minutes 

This guide covers the basics of three common regex functions in Python: findall, search, and match. The three are similar, but each has a different purpose. This guide does not cover how to compose a regular expression, so it assumes you are already somewhat familiar with regex syntax.


The match function is used for finding matches at the beginning of a string only.

import re
re.match(r'hello', 'hello world')
# <re.Match object; span=(0, 5), match='hello'>

But keep in mind this only looks for matches at the beginning of the string.

re.match(r'world', 'hello world')
# None

Even if you're dealing with a multiline string, include a "^" anchor, and pass the re.MULTILINE flag, match will still only look at the very beginning of the string, not at the beginning of each line.

re.match(r'^hello', 'good morning\nhello world\nhello mom', re.MULTILINE)
# None

A great use case for re.match is testing a single value, like a phone number or zip code, against a desired pattern. Keep in mind that match only anchors the start of the string, so any extra characters after the matched portion are still accepted. This is a quick example of testing that a string starts with the desired phone number format.

if re.match(r'(\d{3})-(\d{3})-(\d{4})', '925-783-3005'):
    print("phone number is good")

If the string matches, a match object will be returned; otherwise it will return None.

You can read more about Python match objects if necessary.
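
As a minimal sketch of working with the returned match object, the example below pulls the captured groups out of the phone number pattern from above. It also uses re's fullmatch (available since Python 3.4) to require that the entire string fits the pattern, which is one way to reject trailing characters. The names PHONE and parse_phone are hypothetical and exist only for illustration.

```python
import re

# Hypothetical pattern and helper names, used only for illustration.
PHONE = re.compile(r'(\d{3})-(\d{3})-(\d{4})')

def parse_phone(text):
    """Return (area, prefix, line) if text is exactly a phone number, else None."""
    m = PHONE.fullmatch(text)  # the whole string must match, not just the start
    return m.groups() if m else None

print(parse_phone('925-783-3005'))          # ('925', '783', '3005')
print(parse_phone('925-783-30055'))         # None: the trailing digit is rejected
print(bool(PHONE.match('925-783-30055')))   # True: match only checks the start
```

If you only care that the string begins with a phone number, plain match is enough; fullmatch is the stricter choice for validating a whole field.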

The search function is very much like match, except it scans the entire string and returns the first match it finds. Taking our example from above:

import re
re.search(r'world', 'hello world')
# <re.Match object; span=(6, 11), match='world'>

When using match this would return None, but using search we get our match object. This function is especially useful for determining if a pattern exists in a string. For instance, you might want to see if a line contains the word sandwich.

line = "I love to eat sandwiches for lunch."
if re.search(r'sandwich', line):
    # search returned a match object, which is truthy
    print("Found a sandwich")

Or maybe you want to take a block of text and find out if any of the lines begin with a number:

text = """
1. ricochet robots
2. settlers of catan
3. acquire
"""
match = re.search(r'^\d+\.', text, re.MULTILINE)
match.group()
# '1.'
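
To make the difference from match concrete, here is a small sketch that reuses the multiline string from the match section. It assumes nothing beyond the standard re module and shows that the "^" anchor only matches at the start of later lines when search is combined with re.MULTILINE.

```python
import re

text = 'good morning\nhello world\nhello mom'

# match is anchored to the start of the string, even with re.MULTILINE.
at_start = re.match(r'^hello', text, re.MULTILINE)

# search scans forward, and with re.MULTILINE "^" also matches after each newline.
line_hit = re.search(r'^hello', text, re.MULTILINE)

# Without the flag, "^" only matches at the very start, so search fails too.
no_flag = re.search(r'^hello', text)

print(at_start)          # None
print(line_hit.span())   # (13, 18)
print(no_flag)           # None
```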

Again, this is very valuable for searching through an entire block of text for a match. Note that search stops at the first occurrence. If you want every occurrence of a pattern in a string, use findall, covered next.

Findall does what you would expect - it finds all occurrences of a pattern in a string. This is different from the previous two functions in that it doesn't return a match object. It simply returns a list of matches.

Using our board game example from above:

text = """
1. ricochet robots
2. settlers of catan
3. acquire
"""
re.findall(r'^\d+\.', text, re.MULTILINE)
# ['1.', '2.', '3.']

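As you can see, this returns a list of matches. If you don't use parentheses to capture any groups, the result is a list of strings, each holding the full text of one match. If you capture exactly one group, the result is still a list of strings, but each string is the text of that group rather than the whole match. If you capture more than one group, the result is a list of tuples with one element per group.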

text = """
1. ricochet robots
2. settlers of catan
3. acquire
"""
re.findall(r'^(\d+)\.(.*)$', text, re.MULTILINE)
# [('1', ' ricochet robots'), ('2', ' settlers of catan'), ('3', ' acquire')]

In this case, we're capturing the number and the name of the game in two different groups.
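
The three return shapes described above can be compared side by side. This is a minimal sketch using the same board game list (written here as a single-line string literal for compactness); the variable names are only for illustration.

```python
import re

text = "1. ricochet robots\n2. settlers of catan\n3. acquire"

# No groups: each item is the full text of the match.
no_groups = re.findall(r'^\d+\.', text, re.MULTILINE)

# One group: each item is the text of that group only.
one_group = re.findall(r'^(\d+)\.', text, re.MULTILINE)

# Two groups: each item is a tuple with one element per group.
two_groups = re.findall(r'^(\d+)\. (.*)$', text, re.MULTILINE)

print(no_groups)    # ['1.', '2.', '3.']
print(one_group)    # ['1', '2', '3']
print(two_groups)   # [('1', 'ricochet robots'), ('2', 'settlers of catan'), ('3', 'acquire')]
```

Adding a literal space after the dot in the last pattern keeps the leading space out of the captured game name.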