CS 200: Iterators and Generators¶

Reading:

  • Learning Python, Chapter 14
  • Python Cookbook, Chapter 4

The concept of “iterable objects” is relatively recent in Python, but it has come to permeate the language’s design. It’s essentially a generalization of the notion of sequences—an object is considered iterable if it is either a physically stored sequence, or an object that produces one result at a time in the context of an iteration tool like a for loop. In a sense, iterable objects include both physical sequences and virtual sequences computed on demand.

Below we iterate over a file implicitly using a for loop:

In [1]:
def it0(file="iterators.py"):
    count = 0
    for line in open(file):
        count += 1
        if count < 10:
            print (line, end='')
In [2]:
it0()
#! /usr/bin/python3

'''
From Learning Python, Chapter 14

The concept of “iterable objects” is relatively recent in Python,
but it has come to permeate the language’s design. It’s essentially a
generalization of the notion of sequences—an object is considered
iterable if it is either a physically stored sequence, or an object

Next, we iterate explicitly using next and readline. Each line already ends with a newline and print() adds another, so the output is double-spaced.

In [3]:
def it1(file="iterators.py"):
    f = open(file)
    print(f.readline())
    print(f.readline())
    print(f.__next__())
    print(f.__next__())
    print(next(f))
    print(next(f))
    f.close()
In [4]:
it1()
#! /usr/bin/python3



'''

From Learning Python, Chapter 14



The concept of “iterable objects” is relatively recent in Python,

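__next__() and readline() both return the next line of a file, one line per call. They differ at end of file: readline() returns an empty string, while __next__() raises StopIteration.

__next__() is the method that makes an object an ITERATOR in Python. A file object is its own iterator, and the for loop calls __next__() on each pass.

next(f) is the built-in function that calls f.__next__().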
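The points above can be checked with a small sketch. io.StringIO stands in for a real file so that the example needs no file on disk; that substitution is an assumption of this sketch, not part of the original notebook.

```python
import io

# StringIO behaves like an open text file for iteration purposes
f = io.StringIO("first\nsecond\n")

same_object = iter(f) is f          # a file-like object is its own iterator
lines = [next(f), f.readline()]     # both calls advance the same position

at_eof = f.readline()               # readline() signals end of file with ''
try:
    next(f)                         # next() signals end of file by raising
    raised = False
except StopIteration:
    raised = True

print(same_object, lines, repr(at_eof), raised)
```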

You can explicitly get an iterator from any iterable using iter():

In [5]:
def it2(lst = [1,2,3,4]):
    it = iter(lst)
    print (it.__next__())
    print (next(it))
    print (next(it))
In [6]:
it2()
1
2
3
In [7]:
it2([1,2,3,4,5])
1
2
3
In [8]:
it2([1,2])
1
2
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 it2([1,2])

Input In [5], in it2(lst)
      3 print (it.__next__())
      4 print (next(it))
----> 5 print (next(it))

StopIteration: 
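If running off the end should not be an error, the built-in next() accepts a second argument that is returned instead of raising StopIteration. A minimal sketch (the choice of None as the default is just for illustration):

```python
it = iter([1, 2])

# The third call would raise StopIteration; the default is returned instead
values = [next(it, None) for _ in range(3)]
print(values)
```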

Below we manually implement iteration.

In [9]:
def it4(lst = [1,2,3,4]):
    it = iter(lst)
    while True:
        try:
            x = next(it)
        except StopIteration:
            break
        print (x, ' ', end='')
In [10]:
it4()
1  2  3  4  
In [11]:
it4([1,2,3,4,5,6,7,8])
1  2  3  4  5  6  7  8  
In [12]:
it4([])
In [13]:
it4('abcdef')
a  b  c  d  e  f  

Dictionaries are also built-in iterables. Iterating over a dictionary yields its keys.

In [14]:
def it5(d = {'a':1, 'b':2, 'c': 3}):
    # the parameter is named d so that it does not shadow the built-in dict
    for key in d.keys():
        print (key, d[key])
In [15]:
it5()
a 1
b 2
c 3
In [16]:
it5({})
In [17]:
it5({'john':20, 'mary':30, 'joe':45, 'anne':23})
john 20
mary 30
joe 45
anne 23
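A dictionary can also be iterated directly, without calling keys(), and items() yields key/value pairs in one step. A short sketch, using a hypothetical ages dictionary:

```python
ages = {'john': 20, 'mary': 30}

keys = [name for name in ages]                      # iterating a dict yields keys
pairs = [(name, age) for name, age in ages.items()]  # items() yields (key, value)
print(keys, pairs)
```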

Range objects are virtual sequences: they compute their values on demand instead of storing a list.

In [18]:
def it6(n = 6):
    R = range(n)
    it = iter(R)
    print (next(it))
    print (next(it))
    print (list(it))
In [19]:
it6()
0
1
[2, 3, 4, 5]
In [20]:
it6(1)
0
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Input In [20], in <cell line: 1>()
----> 1 it6(1)

Input In [18], in it6(n)
      3 it = iter(R)
      4 print (next(it))
----> 5 print (next(it))
      6 print (list(it))

StopIteration: 
In [21]:
it6(2)
0
1
[]
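Because a range computes its values on demand, even a very large one is cheap to create and query. The sketch below is illustrative; the size of one billion is an arbitrary choice.

```python
# A range knows its length and can index or test membership
# without ever building a list of its values.
r = range(1_000_000_000)

length = len(r)             # derived from start, stop and step
last = r[-1]                # indexing is arithmetic, not a stored lookup
found = 999_999_999 in r    # integer membership is arithmetic too
print(length, last, found)
```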

Enumerate objects are iterable.

In [22]:
def it7(text = "hello, world!"):
    it = iter(enumerate(text))
    print (next(it))
    print (next(it))
    print (list(it))
In [23]:
it7()
(0, 'h')
(1, 'e')
[(2, 'l'), (3, 'l'), (4, 'o'), (5, ','), (6, ' '), (7, 'w'), (8, 'o'), (9, 'r'), (10, 'l'), (11, 'd'), (12, '!')]

You can iterate through the output of Unix commands, one line at a time.

In [24]:
import os
def it8(cmd = 'ls -l'):
    p = os.popen(cmd)
    count = 0
    for x in p:
        count += 1
        if count < 10:
            print (x, end='')
In [25]:
it8()
total 31924
-rw-r--r-- 1 sbs5 cs200ta    5619 Aug 31 12:46 0831.html
-rw-rw-r-- 1 sbs5 sbs5     620327 Aug  1 18:57 0831nb.html
-rw-rw-r-- 1 sbs5 cs200ta   33828 Aug  1 18:57 0831nb.ipynb
-rw-rw-r-- 1 sbs5 cs200ta    1834 Aug 31 15:32 0831.script
-rw-r--r-- 1 sbs5 cs200ta    4919 Oct  7 08:40 0902.html
-rw-rw-r-- 1 sbs5 cs200ta   24567 Sep  2 15:38 0902.script
-rw-r--r-- 1 sbs5 cs200ta    4567 Oct  7 08:40 0907.html
-rw-rw-r-- 1 sbs5 cs200ta   27126 Sep  7 16:08 0907.script

map returns an iterator. Printing it shows a map object, not a list of results.

In [26]:
def it9(lst = [1,2,3,4,5,6,7,8]):
    x = map(lambda x: x*x, lst)
    print (x)
    for e in x:
        print (e)
In [27]:
it9()
<map object at 0x7f2ce6732c20>
1
4
9
16
25
36
49
64
In [28]:
def it10(lst = [1,2,3,4,5,6,7,8]):
    x = map(lambda x: x*x, lst)
    try:
        # a map object is always truthy, so loop until next() raises
        while True:
            print (next(x))
    except StopIteration:
        print ('done')
In [29]:
it10()
1
4
9
16
25
36
49
64
done
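One consequence of map returning an iterator is that the results can be consumed only once. A small sketch of that behavior:

```python
squares = map(lambda x: x * x, [1, 2, 3])

first_pass = list(squares)    # consumes every item
second_pass = list(squares)   # nothing is left the second time
print(first_pass, second_pass)
```

To traverse the results more than once, store them in a list first or call map again.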

filter returns an iterator.

In [30]:
def it11(lst = [1,2,3,4,5,6,7,8]):
    x = filter(lambda x: x%2, lst)
    print (x)
    for e in x:
        print (e)
In [31]:
it11()
<filter object at 0x7f2ce6732c50>
1
3
5
7
In [32]:
def it12(lst = [1,2,3,4,5,6,7,8]):
    x = filter(lambda x: x%2, lst)
    try:
        # a filter object is always truthy, so loop until next() raises
        while True:
            print (next(x))
    except StopIteration:
        print ('done')
In [33]:
it12()
1
3
5
7
done

zip¶

zip returns an iterator.

zip combines two or more iterables into a single stream of tuples; list() collects them into a list.

In [34]:
z = zip([1,2,3], [4,5,6])
In [35]:
z
Out[35]:
<zip at 0x7f2ce5d24980>
In [36]:
list(z)
Out[36]:
[(1, 4), (2, 5), (3, 6)]
In [37]:
list(zip([1,2,3], [4,5,6], [7,8,9]))
Out[37]:
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]

If the arguments are not the same length, zip stops as soon as the shortest one is exhausted.

In [38]:
list(zip([1,2,3,4], [5,6,7]))
Out[38]:
[(1, 5), (2, 6), (3, 7)]
In [39]:
list(zip([1,2,3,4,5,6,7], [7,6,5,4,3,2,1], [1,1,1]))
Out[39]:
[(1, 7, 1), (2, 6, 1), (3, 5, 1)]
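When dropping the extra items is not wanted, itertools.zip_longest pads the shorter arguments instead. A minimal sketch, with None chosen as the fill value for illustration:

```python
from itertools import zip_longest

# The shorter list is padded with fillvalue rather than truncating the longer one
padded = list(zip_longest([1, 2, 3, 4], [5, 6, 7], fillvalue=None))
print(padded)
```

Python 3.10 and later also accept zip(..., strict=True), which raises ValueError on a length mismatch.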
In [40]:
def it13(lst1 = [1,2,3,4], lst2=[5,6,7,8]):
    x = zip(lst1, lst2)
    print(x)
    for a,b in x:
        print (a,b)
In [41]:
it13()
<zip object at 0x7f2ce5d5c140>
1 5
2 6
3 7
4 8
In [42]:
def it14(lst1 = [1,2,3,4], lst2=[5,6,7,8]):
    x = zip(lst1, lst2)
    print(x)
    try:
        # a zip object is always truthy, so loop until next() raises
        while True:
            print (next(x))
    except StopIteration:
        print ('done')
In [43]:
it14()
<zip object at 0x7f2ce5d5c600>
(1, 5)
(2, 6)
(3, 7)
(4, 8)
done

Synchronized vs independent iterators¶

You may have multiple iterators over the same object. Some are synchronized (they share one position), and some are independent.

zip iterators are synchronized: a zip object is its own iterator, so iter(z) returns z itself.

In [44]:
def it15():
    z = zip([1,2,3], [4,5,6])
    i1 = iter(z)
    i2 = iter(z)
    print (next(i1))
    print (next(i1))
    print (next(i2))
In [45]:
it15()
(1, 4)
(2, 5)
(3, 6)

map iterators are synchronized too: iter(z) returns the map object itself.

In [46]:
def it16():
    z = map(lambda x: x*x, [1,2,3])
    i1 = iter(z)
    i2 = iter(z)
    print (next(i1))
    print (next(i1))
    print (next(i2))
In [47]:
it16()
1
4
9

Range iterators are independent! A range is an iterable but not an iterator, so each call to iter() returns a fresh iterator.

In [48]:
def it17():
    z = range(10)
    i1 = iter(z)
    i2 = iter(z)
    print (next(i1))
    print (next(i1))
    print (next(i2))
In [49]:
it17()
0
1
0
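The difference can be tested directly: an object whose iterators are synchronized returns itself from iter(). The sketch below also uses itertools.tee, which is not part of the original notebook, as one way to get independent iterators from a single one.

```python
from itertools import tee

z = zip([1, 2, 3], [4, 5, 6])
r = range(10)

zip_is_own_iterator = iter(z) is z      # True: one shared position
range_is_own_iterator = iter(r) is r    # False: iter() builds a new iterator

# tee splits one iterator into independent copies (it buffers items internally)
a, b = tee(map(lambda x: x * x, [1, 2, 3]))
results = [next(a), next(a), next(b)]
print(zip_is_own_iterator, range_is_own_iterator, results)
```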

Class iteration.¶

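Simple iteration in a class: __iter__ returns the iterator and __next__ produces each value. This class never raises StopIteration, so it counts forever.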

In [50]:
class MyNumbers:
  def __iter__(self):
    self.a = 1
    return self

  def __next__(self):
    x = self.a
    self.a += 1
    return x
In [51]:
myclass = MyNumbers()
myiter = iter(myclass)

print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))
1
2
3
4
5
In [52]:
next(myiter)
Out[52]:
6
In [53]:
next(myiter)
Out[53]:
7
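A class-based iterator ends by raising StopIteration from __next__. The sketch below uses a hypothetical Countdown class to show a finite version of the same pattern.

```python
class Countdown:
    """Hypothetical finite iterator: counts from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # the object is its own iterator
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # tells for loops and list() to stop
        value = self.current
        self.current -= 1
        return value


counted = list(Countdown(3))
print(counted)
```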

Iteration using deep recursion - actually deep iteration¶

What would it mean to have an iterator for the person class?

To iterate over a person's friends and family?

In [54]:
class Node:
    def __init__(self, value):
        self.value = value
        self.children = []

    def __repr__(self):
        return 'Node({!r})'.format(self.value)

    def add_child(self, node):
        self.children.append(node)

    ## a depth-first (post-order) traversal of the tree
    def __iter__(self):
        print ("Calling from: " + str(self.value))
        # first, yield everything that each of the child nodes would yield.
        for child in self.children:
            for item in child:
                # two for loops are needed: the outer one visits each child,
                # and the inner one iterates over that child's subtree.
                yield item

        # finally, yield self
        yield self
In [55]:
def it18():
    root = Node(0)
    child1 = Node(1)
    child2 = Node(2)
    child3 = Node(3)
    child4 = Node(4)
    child5 = Node(5)
    child6 = Node(6)
    root.add_child(child1)
    root.add_child(child2)
    child1.add_child(child3)
    child3.add_child(child4)
    child4.add_child(child5)
    child5.add_child(child6)

    for ch in root:
        print (ch)
In [56]:
it18()
Calling from: 0
Calling from: 1
Calling from: 3
Calling from: 4
Calling from: 5
Calling from: 6
Node(6)
Node(5)
Node(4)
Node(3)
Node(1)
Calling from: 2
Node(2)
Node(0)
In [57]:
grandpa = Node('Grandpa')
dad = Node('Dad')
uncle = Node('Uncle')
aunt = Node('Aunt')
son = Node('son')
daughter = Node('daughter')
grandpa.add_child(dad)
grandpa.add_child(uncle)
grandpa.add_child(aunt)
dad.add_child(son)
dad.add_child(daughter)
In [58]:
for offspring in grandpa:
    print (offspring)
Calling from: Grandpa
Calling from: Dad
Calling from: son
Node('son')
Calling from: daughter
Node('daughter')
Node('Dad')
Calling from: Uncle
Node('Uncle')
Calling from: Aunt
Node('Aunt')
Node('Grandpa')
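The inner loop that re-yields each item can also be written with yield from, which delegates to another iterable. The sketch below uses a hypothetical TreeNode class so that it runs on its own; it is a variation on the Node class, not a replacement for it.

```python
class TreeNode:
    """Hypothetical stand-in for Node, using yield from for the traversal."""

    def __init__(self, value):
        self.value = value
        self.children = []

    def add_child(self, node):
        self.children.append(node)

    def __iter__(self):
        for child in self.children:
            # delegate to the child's own iterator (its whole subtree)
            yield from child
        # post-order: a node comes after all of its descendants
        yield self


root, a, b, c = TreeNode(0), TreeNode(1), TreeNode(2), TreeNode(3)
root.add_child(a)
root.add_child(b)
a.add_child(c)

order = [node.value for node in root]
print(order)
```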

Generators¶

Define an iterator using yield. Here __iter__ is a generator method, so calling iter() on a yrange returns a generator object.

In [59]:
class yrange:
    def __init__(self, n):
        self.i = 0
        self.n = n

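    def __iter__(self):
        while self.i < self.n:
            i = self.i
            self.i += 1
            yield i
        # No explicit StopIteration is needed: returning from a generator
        # ends the iteration. Since Python 3.7 (PEP 479), raising
        # StopIteration inside a generator becomes a RuntimeError.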
In [60]:
def it19():
    print (list(yrange(5)))
    print (sum(yrange(5)))
    y = iter(yrange(3))
    print (next(y))
    print (next(y))
    print (next(y)) 
In [61]:
it19()
[0, 1, 2, 3, 4]
10
0
1
2
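Because yrange keeps its counter on the instance, one yrange object can be traversed only once. The sketch below repeats the class so that it runs on its own, and contrasts it with a hypothetical ReusableRange that keeps the counter local to each call of __iter__.

```python
class yrange:
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        while self.i < self.n:
            i = self.i
            self.i += 1      # state lives on the instance
            yield i


class ReusableRange:
    """Hypothetical variant: each __iter__ call starts from zero."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        i = 0                # state lives in the generator, not the instance
        while i < self.n:
            yield i
            i += 1


y = yrange(3)
y_first, y_second = list(y), list(y)

r = ReusableRange(3)
r_first, r_second = list(r), list(r)
print(y_first, y_second, r_first, r_second)
```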

We can use generators to create iterators.

In [62]:
def zrange(n):
    i = 0
    while i < n:
        yield i
        i += 1
In [63]:
def it20():
    print (list(zrange(5)))
    print (sum(zrange(5)))
    z = zrange(3)
    print (next(z))
    print (next(z))
    print (next(z))
In [64]:
it20()
[0, 1, 2, 3, 4]
10
0
1
2

How yield and next operate together:

In [65]:
def foo():
    print ("begin")
    for i in range(3):
        print ("before yield", i)
        yield i
        print ("after yield", i)
    print ("end")
In [66]:
def it23():
    f = foo()
    print (f)
    next(f)
    next(f)
    next(f)
    next(f)
In [67]:
it23()
<generator object foo at 0x7f2ce5b80d60>
begin
before yield 0
after yield 0
before yield 1
after yield 1
before yield 2
after yield 2
end
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Input In [67], in <cell line: 1>()
----> 1 it23()

Input In [66], in it23()
      5 next(f)
      6 next(f)
----> 7 next(f)

StopIteration: 

Infinite data objects using generators¶

In [68]:
def integers():
    """Infinite sequence of integers."""
    i = 1
    while True:
        yield i
        i = i + 1
In [69]:
x = integers()
In [70]:
for i in range(10):
    print (next(x), end=' ')
1 2 3 4 5 6 7 8 9 10 
In [71]:
for i in range(10):
    print (next(x), end=' ')
11 12 13 14 15 16 17 18 19 20 
In [72]:
for i in range(10):
    print (next(x), end=' ')
21 22 23 24 25 26 27 28 29 30 

Infinite squares.

In [73]:
def squares():
    for i in integers():
        yield i * i
In [74]:
s = squares()

for i in range(10):
    print (next(s), end=' ')
1 4 9 16 25 36 49 64 81 100 
In [75]:
for i in range(10):
    print (next(s), end=' ')    
121 144 169 196 225 256 289 324 361 400 

take will return the first n values from the given sequence.

In [76]:
def take(n, seq):
    """Returns first n values from the given sequence."""
    seq = iter(seq)
    result = []
    try:
        for i in range(n):
            result.append(next(seq))
    except StopIteration:
        pass
    return result
In [77]:
take(10, integers())
Out[77]:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
In [78]:
take(10, squares())
Out[78]:
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
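The standard library offers the same idea ready-made: itertools.count produces an infinite stream of integers and itertools.islice takes a bounded slice of any iterator. This sketch is an alternative to integers() and take(), not the notebook's own method.

```python
from itertools import count, islice

# count(1) is an infinite stream 1, 2, 3, ...; islice stops after n items
first_ten = list(islice(count(1), 10))
first_squares = list(islice((i * i for i in count(1)), 5))
print(first_ten, first_squares)
```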

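take2 adds a finally clause so that the result is returned even if an exception is thrown. Note that a return inside finally suppresses any exception still in flight, not just StopIteration.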

In [79]:
def take2(n, seq):
    """Returns first n values from the given sequence."""
    seq = iter(seq)
    result = []
    try:
        for i in range(n):
            result.append(next(seq))
    except StopIteration:
        pass
    finally: 
        return result
In [80]:
def it24():
    print (take2(5,squares()))
In [81]:
it24()
[1, 4, 9, 16, 25]
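A return inside finally discards any exception that is propagating at that moment. The sketch below repeats take2 and feeds it a hypothetical flaky() generator that fails with ValueError after one item; the error never reaches the caller.

```python
def take2(n, seq):
    """Returns first n values from the given sequence."""
    seq = iter(seq)
    result = []
    try:
        for i in range(n):
            result.append(next(seq))
    except StopIteration:
        pass
    finally:
        return result   # also swallows exceptions other than StopIteration


def flaky():
    """Hypothetical generator that fails after its first item."""
    yield 1
    raise ValueError("boom")


partial = take2(5, flaky())   # the ValueError is silently dropped
print(partial)
```

Whether that silence is acceptable depends on the caller; the version without finally lets unexpected errors surface.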

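A generator expression creates a generator: write a comprehension with () instead of []. Because g is a single generator object, each call to take consumes more of it, so the second it25() call continues where the first stopped.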

In [82]:
g = (x*x for x in range(100))
In [83]:
def it25():
    print (take(10,g))
In [84]:
it25()
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
In [85]:
it25()
[100, 121, 144, 169, 196, 225, 256, 289, 324, 361]

Our old prime number friends.

In [86]:
noprimes = [j for i in range(2, 8) for j in range(i*2, 100, i)]
yesprimes = (x for x in range(2, 100) if x not in noprimes)

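Note: this does not work if noprimes is a generator. The first "x not in noprimes" test would consume the whole generator, so every later test would search an exhausted stream and every number would look prime.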

In [87]:
def it26():
    print (take(10,yesprimes))
In [88]:
it26()
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
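The reason a generator cannot serve as noprimes is that a membership test consumes items as it searches. A small sketch with made-up values:

```python
gen = (n for n in [4, 6, 8])

first_check = 6 in gen    # scans 4, then 6, and stops there
second_check = 4 in gen   # only 8 is left, so 4 is no longer found
print(first_check, second_check)
```

A list (or a set, for faster lookups) can be searched any number of times.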

End of Iterators and Generators notebook.