A generator function looks very similar to a regular function, but there's one major difference: yield. When you include the yield keyword in a function, the function automatically becomes a generator function.
Calling a generator function doesn't execute its body at all; it immediately returns a generator object. Only when we call next on that object does the body begin executing, and it runs until it reaches a yield statement.
Here's a basic, useless example:
yield "King Arthur"
yield "Brave Sir Robin"
yield "Sir Galahad the Chaste"
u = useless()
# "King Arthur"
# "Brave Sir Robin"
# "Sir Galahad the Chaste"
You can see that each time you call next on the generator object, execution continues until it reaches the next yield statement. So what happens if you call next again, after the last yield?
Well, it raises a StopIteration exception.
$ python useless.py
King Arthur
Brave Sir Robin
Sir Galahad the Chaste
Traceback (most recent call last):
  File "useless.py", line 10, in <module>
    print(next(u))
StopIteration
What is interesting about the generator function is that even though yield passes control back to the caller, the generator's state is frozen in place. Calling next simply resumes execution where it left off, until the next yield statement is reached.
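To see that frozen state in action, here's a small illustrative generator (not from the example above) whose local variable survives between next calls:

```python
def countdown(n):
    # n is an ordinary local variable; it is preserved while the
    # generator is suspended at yield.
    while n > 0:
        yield n
        n -= 1   # execution resumes here on the next next() call

c = countdown(3)
print(next(c))   # 3
print(next(c))   # 2 -- n was remembered, not reinitialized
print(next(c))   # 1
```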
The value of using a generator for our purpose is now clear: we can write a generator function that yields one match at a time rather than loading all of the matches into memory.
def find_matches(filenames, pattern):
    for fname in filenames:
        for line in open(fname):
            if pattern in line:
                yield line
We can call it the same way and get the same apparent results.
files = ['t1.txt', 't2.txt', 't3.txt']
for match in find_matches(files, 'the'):
    print(match)
The difference is that our generator can handle extremely large files, and many of them, because only one line is ever held in memory at a time. Without a generator this would be extremely messy.
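As a quick sketch of why the laziness pays off, itertools.islice can pull just the first few matches without ever reading the rest of the files. The throwaway files created here are purely for demonstration:

```python
import os
import tempfile
from itertools import islice

def find_matches(filenames, pattern):
    for fname in filenames:
        for line in open(fname):
            if pattern in line:
                yield line

# Create a couple of small throwaway files for the demonstration.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(2):
    path = os.path.join(tmpdir, f"t{i}.txt")
    with open(path, "w") as f:
        f.write("the quick brown fox\nno match here\nover the lazy dog\n")
    paths.append(path)

# islice stops after two matches; the generator is never fully
# consumed, so most of the input is never even read.
first_two = list(islice(find_matches(paths, "the"), 2))
print(first_two)
```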