Python Regexes - How to Match Objects in Python

Tyler
Total time: 5 minutes 

In Python, regular expression matches can be returned in the form of a match object. In this guide, I'll cover the basics of how to make use of a match object.


A group is a part of a pattern, wrapped in parentheses, whose matched text you want to capture. Let's use re.match to capture the first and last name in a string.

import re
m = re.match(r"(\w+) (\w+)", "Adam Smith")

m is a match object, and this object gives us access to a method called group.

m.group()
# 'Adam Smith'

Calling m.group() returns the entire match. Calling it this way is the same as calling m.group(0). If you want a single group, pass its index instead. We know that we are expecting two separate groups, so we can call m.group(1) and m.group(2):

m.group(1)
# 'Adam'
m.group(2)
# 'Smith'
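
One thing worth keeping in mind: re.match returns None when the pattern does not match at the start of the string, so calling group on the result would raise an AttributeError. Here is a minimal sketch of guarding against that. The helper name split_name is made up for illustration. It also passes two indices to group at once, which returns the captures as a tuple.

```python
import re

def split_name(text):
    """Hypothetical helper: return (first, last), or None if text doesn't match."""
    m = re.match(r"(\w+) (\w+)", text)
    if m is None:
        # re.match returns None when there is no match at the start of the string
        return None
    # group() accepts several indices and returns the captures as a tuple
    return m.group(1, 2)

print(split_name("Adam Smith"))  # ('Adam', 'Smith')
print(split_name("Adam"))        # None
```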

Using the (?P<name>) syntax, you can even access the group by name:

import re
m = re.match(r"(?P<first>\w+) (?P<last>\w+)", "Adam Smith")
m.group('first')
# 'Adam'
m.group('last')
# 'Smith'
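
Named groups are still numbered, so you can mix both styles. The sketch below, assuming Python 3.6 or later, also shows subscript access on the match object, which is shorthand for calling group.

```python
import re

m = re.match(r"(?P<first>\w+) (?P<last>\w+)", "Adam Smith")

# Named groups keep their position, so both lookups return the same text
print(m.group("first"), m.group(1))  # Adam Adam

# Python 3.6+ also supports subscript access on match objects
print(m["last"], m[2])  # Smith Smith
```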

Using groups, we can return a tuple of the captured groups. Suppose we wanted to parse a telephone number.

import re
m = re.match(r"(\d{3})[.\-]?(\d{3})[.\-]?(\d{4})", "925.783.3005")
m.groups()
# ('925', '783', '3005')
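
Because groups returns an ordinary tuple, you can unpack it straight into variables. This is a small sketch of that idea. The variable names and the output format are just illustrative choices.

```python
import re

m = re.match(r"(\d{3})[.-]?(\d{3})[.-]?(\d{4})", "925.783.3005")

# groups() returns a plain tuple, so it can be unpacked directly
area, prefix, line = m.groups()
formatted = f"({area}) {prefix}-{line}"
print(formatted)  # (925) 783-3005
```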

A neat thing about the groups method is that you can pass in a default value for groups that did not participate in the match. Let's make the area code optional. If no default is provided, those groups come back as None.

import re
m = re.match(r"(\d{3})?[.\-]?(\d{3})[.\-]?(\d{4})", "783.3005")
m.groups()
# (None, '783', '3005')

Let's pass in a default value now:

import re
m = re.match(r"(\d{3})?[.\-]?(\d{3})[.\-]?(\d{4})", "783.3005")
m.groups('xxx')
# ('xxx', '783', '3005')
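
Note that the default only applies to groups that did not participate in the match at all. A group that participates but matches an empty string still returns ''. The patterns below are made up purely to show the difference.

```python
import re

# Optional group that does not participate: the default is used
skipped = re.match(r"(\d+)?-(\d+)", "-42").groups("xxx")
print(skipped)  # ('xxx', '42')

# Group that participates but matches an empty string: no default
empty = re.match(r"(\d*)-(\d+)", "-42").groups("xxx")
print(empty)  # ('', '42')
```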

Now let's look at groupdict. This returns our groups in the form of a dictionary.

import re
m = re.match(r"(?P<first>\w+) (?P<last>\w+)", "Adam Smith")
m.groupdict()
# {'first': 'Adam', 'last': 'Smith'}
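
Keep in mind that groupdict only includes named groups. In this sketch the second group is deliberately left unnamed, so it shows up in groups but not in groupdict.

```python
import re

# Only the first group is named here; the second is a plain numbered group
m = re.match(r"(?P<first>\w+) (\w+)", "Adam Smith")

print(m.groupdict())  # {'first': 'Adam'}
print(m.groups())     # ('Adam', 'Smith')
```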

As with groups, you can pass in a default value for groups that did not participate in the match. Make the space and the last name optional, and you'll see the default value take effect:

import re
m = re.match(r"(?P<first>\w+) ?(?P<last>\w+)?", "Adam")
m.groupdict(False)
# {'first': 'Adam', 'last': False}
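
One place a default comes in handy is string formatting, where a missing group would otherwise print as "None". Here is a rough sketch using an empty string as the default. The helper name display_name is hypothetical.

```python
import re

def display_name(text):
    """Hypothetical helper: format a name, tolerating a missing last name."""
    m = re.match(r"(?P<first>\w+) ?(?P<last>\w+)?", text)
    if m is None:
        return None
    # An empty-string default keeps str.format from printing "None"
    parts = m.groupdict("")
    return "{first} {last}".format(**parts).strip()

print(display_name("Adam Smith"))  # Adam Smith
print(display_name("Adam"))        # Adam
```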