Python Syntax: Iterators and Generators

Iterators

Let's start with an example:

class Company:
    def __init__(self, users):
        self.users = users
        self.index = 0

    def __getitem__(self, item):
        # `item` is the index requested during iteration: 0, 1, 2, ...
        return self.users[item]


if __name__ == "__main__":
    c = Company(["小明", "小红", "小刚"])
    for u in c:
        print(u)

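With this, instances of the class can be iterated over. The item parameter of __getitem__ is the iteration index. This works because, when a class has no __iter__ method, the interpreter falls back to the sequence protocol: it builds an iterator internally that calls __getitem__ with 0, 1, 2, ... until IndexError is raised. The behavior is roughly equivalent to the following code: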

from collections.abc import Iterator


class Company:
    def __init__(self, users):
        self.users = users
        self.index = 0

    def __iter__(self):
        # Return a fresh iterator each time iteration starts.
        return MyIterator(self.users)


class MyIterator(Iterator):
    def __init__(self, items):
        # `items` avoids shadowing the built-in name `list`.
        self.index = 0
        self.items = items

    def __next__(self):
        try:
            item = self.items[self.index]
        except IndexError:
            # Signal that there are no more items.
            raise StopIteration
        self.index += 1
        return item


if __name__ == "__main__":
    c = Company(["小明", "小红", "小刚"])
    for u in c:
        print(u)

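Notice that we raise StopIteration, yet nothing in our code catches it. That is because the for ... in loop handles the exception internally.

A for ... in loop is roughly equivalent to the following code: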

class Company:
    def __init__(self, users):
        self.users = users
        self.index = 0

    def __getitem__(self, item):
        return self.users[item]


if __name__ == "__main__":
    c = Company(["小明", "小红", "小刚"])
    my_itor = iter(c)
    while True:
        try:
            item = next(my_itor)
            print(item)
        except StopIteration:
            # The iterator is exhausted, so leave the loop.
            break
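
When a class defines __getitem__ but not __iter__, iter() falls back to the sequence protocol. The following is a minimal sketch of that fallback; the class name Letters is hypothetical and exists only for illustration. It also shows that such an object is not recognized as collections.abc.Iterable, because that check only looks for __iter__.

```python
from collections.abc import Iterable


class Letters:
    """Hypothetical class that only implements __getitem__."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        # Raises IndexError past the end, which ends the iteration.
        return self.data[index]


letters = Letters(["a", "b", "c"])

# iter() calls __getitem__ with 0, 1, 2, ... until IndexError is raised.
collected = list(iter(letters))
print(collected)  # ['a', 'b', 'c']

# The abstract base class check only looks for __iter__, so this is False.
print(isinstance(letters, Iterable))  # False
```

So "can be used in a for loop" and "is an instance of Iterable" are not quite the same thing; calling iter() on the object is the more reliable test.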

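Generators

Example

Any function whose body contains yield is compiled as a generator function; calling it returns a generator object instead of running the body.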

import dis
def gen_func():
    name = "小红"
    yield name
    name = "小明"
    yield name
    name = "小刚"


if __name__ == "__main__":
    g = gen_func()
    # Show the bytecode of the generator.
    dis.dis(g)
    print("-----------------")
    # State before the generator has started.
    print(g.gi_frame.f_lasti)
    print(g.gi_frame.f_locals)
    print("-----------------")
    # Run to the first yield, then inspect the suspended frame.
    print(next(g))
    print(g.gi_frame.f_lasti)
    print(g.gi_frame.f_locals)
    print("-----------------")
    print(next(g))

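Here:

g = gen_func() creates the generator object; no code in the function body runs yet.

dis.dis(g) prints the bytecode of the generator (the exact instructions vary between Python versions).

g.gi_frame.f_lasti is the offset of the most recently executed bytecode instruction in the generator's frame, not a source line number; it is -1 before the generator has started.

g.gi_frame.f_locals shows the frame's local variables at that point.

Output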

  3           0 LOAD_CONST               1 ('小红')
              2 STORE_FAST               0 (name)

  4           4 LOAD_FAST                0 (name)
              6 YIELD_VALUE
              8 POP_TOP

  5          10 LOAD_CONST               2 ('小明')
             12 STORE_FAST               0 (name)

  6          14 LOAD_FAST                0 (name)
             16 YIELD_VALUE
             18 POP_TOP

  7          20 LOAD_CONST               3 ('小刚')
             22 STORE_FAST               0 (name)
             24 LOAD_CONST               0 (None)
             26 RETURN_VALUE
-----------------
-1
{}
-----------------
小红
6
{'name': '小红'}
-----------------
小明
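
Besides reading gi_frame directly, the standard library's inspect.getgeneratorstate reports where a generator is in its life cycle. The snippet below is a small sketch using a shortened version of gen_func; the states list is just a helper name for this illustration.

```python
import inspect


def gen_func():
    name = "小红"
    yield name
    name = "小明"
    yield name


g = gen_func()
states = [inspect.getgeneratorstate(g)]  # created, body not started yet

next(g)  # run to the first yield
states.append(inspect.getgeneratorstate(g))  # paused at a yield

list(g)  # consume the remaining values
states.append(inspect.getgeneratorstate(g))  # finished

print(states)  # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
print(g.gi_frame)  # None: the frame is released once the generator finishes
```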

We can also use a for loop to iterate over the values a generator produces:

def gen_func():
    name = "小红"
    yield name
    name = "小明"
    yield name
    name = "小刚"
    # A returned value is not yielded to the for loop.
    return name


if __name__ == "__main__":
    g = gen_func()
    for item in g:
        print(item)

Output

小红
小明

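As the output shows, the loop only receives the values produced by yield. The value passed to return is not yielded; it is attached to the StopIteration exception that ends the iteration.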
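
If the returned value is needed, it can be read from the value attribute of the StopIteration exception. The following is a minimal sketch that drives the generator by hand; yielded and returned are helper names introduced here for illustration.

```python
def gen_func():
    yield "小红"
    yield "小明"
    return "小刚"


g = gen_func()
yielded = []
returned = None
while True:
    try:
        yielded.append(next(g))
    except StopIteration as stop:
        # The value passed to `return` travels on the exception.
        returned = stop.value
        break

print(yielded)   # ['小红', '小明']
print(returned)  # 小刚
```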

Sending values into a generator

def consumer():
    r = '初始'  # initial value, yielded when the generator is started
    while True:
        # Yield r to the caller, then receive the value passed to send().
        n = yield r
        if n:
            print(f'[CONSUMER] Consuming {n}')
            r = f"receive value: {n}"
        else:
            # next(c) is the same as c.send(None), so n is None here.
            print('[CONSUMER] no input')
            r = "receive value: no input"


c = consumer()
print("c.send(None):")
print(c.send(None))
print("启动了----")  # "started"
print("c.send(1):")
print(c.send(1))
print("next(c):")
print(next(c))
print("c.send(2):")
print(c.send(2))

Output

c.send(None):
初始
启动了----
c.send(1):
[CONSUMER] Consuming 1
receive value: 1
next(c):
[CONSUMER] no input
receive value: no input
c.send(2):
[CONSUMER] Consuming 2
receive value: 2

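Notes

  1. A generator must be started with c.send(None) (or, equivalently, next(c)). Calling c.send(1) before it has started raises a TypeError. Starting the generator runs the body up to the first yield, which hands back the initial value.

  2. (Figure: image-20211130000215973) After startup, execution proceeds in the order 1, 2, 3 marked in the figure.

  3. c.close() closes the generator.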
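
These points can be checked with a small experiment. The snippet below is only a sketch: echo is a hypothetical generator written for this illustration, and the variable names are arbitrary.

```python
def echo():
    """Hypothetical generator that yields back whatever it receives."""
    received = None
    while True:
        received = yield received


c = echo()

# Sending a non-None value before the generator has started fails.
try:
    c.send(1)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

first = next(c)          # start it; same as c.send(None), yields None
reply = c.send("hello")  # resumes at the yield and yields "hello" back
c.close()                # the generator is now finished

# Any further use of a closed generator raises StopIteration.
try:
    next(c)
    closed = False
except StopIteration:
    closed = True

print(first, reply, closed)  # None hello True
```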

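Producer and consumer

In the example below, execution switches back and forth between the two functions. Coroutines are built on this mechanism.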

def consumer():
    r = '初始'  # initial value, yielded when the generator is started
    while True:
        # Pause here until the producer sends the next value.
        n = yield r
        if n:
            print(f'[CONSUMER] Consuming {n}')
            r = f"receive value: {n}"
        else:
            print('[CONSUMER] no input')
            r = "receive value: no input"


def produce(c):
    # Start the consumer so it is paused at its first yield.
    c.send(None)
    n = 0
    while n < 5:
        n = n + 1
        print(f'[PRODUCER] Producing {n}...')
        # Switch to the consumer; control returns at its next yield.
        r = c.send(n)
        print(f'[PRODUCER] Consumer return: {r}')
    c.close()


c = consumer()
produce(c)

Output

[PRODUCER] Producing 1...
[CONSUMER] Consuming 1
[PRODUCER] Consumer return: receive value: 1
[PRODUCER] Producing 2...
[CONSUMER] Consuming 2
[PRODUCER] Consumer return: receive value: 2
[PRODUCER] Producing 3...
[CONSUMER] Consuming 3
[PRODUCER] Consumer return: receive value: 3
[PRODUCER] Producing 4...
[CONSUMER] Consuming 4
[PRODUCER] Consumer return: receive value: 4
[PRODUCER] Producing 5...
[CONSUMER] Consuming 5
[PRODUCER] Consumer return: receive value: 5

Reading files

Reading a multi-line file

aaaaaa
bbb
ccc
dd

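Reading it:

with open("input.txt") as f:
    # Iterating over the file object reads one line at a time,
    # instead of loading every line into memory with readlines().
    for line in f:
        print("line:" + line)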

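Reading a large single-line file

Suppose the file consists of a single, very large line whose records are separated by a fixed delimiter:
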
aaaaaa[|]bbb[|]ccc[|]dd

Code to read it:

def myreadlines(f, separator):
    buf = ""
    while True:
        # Yield every complete record currently in the buffer.
        while separator in buf:
            pos = buf.index(separator)
            yield buf[:pos]
            buf = buf[pos + len(separator):]
        # Read the next chunk; only one chunk plus a partial record is in memory.
        chunk = f.read(4096)
        if not chunk:
            # End of file: whatever is left is the last record.
            yield buf
            break
        buf += chunk


with open("input.txt") as f:
    for line in myreadlines(f, "[|]"):
        print("line:" + line)
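
To see that a separator split across two chunks is still handled, the reader can be exercised on an in-memory file with a deliberately tiny chunk size. This is a sketch under one assumption: the chunk_size parameter is added here for the experiment and is not part of the function above, which always reads 4096 characters.

```python
import io


def myreadlines(f, separator, chunk_size=4096):
    # chunk_size is an assumption added for this demo.
    buf = ""
    while True:
        # Yield every complete record currently in the buffer.
        while separator in buf:
            pos = buf.index(separator)
            yield buf[:pos]
            buf = buf[pos + len(separator):]
        chunk = f.read(chunk_size)
        if not chunk:
            # End of input: the remainder is the last record.
            yield buf
            break
        buf += chunk


data = io.StringIO("aaaaaa[|]bbb[|]ccc[|]dd")
# With 4-character chunks, "[|]" is regularly cut in the middle.
records = list(myreadlines(data, "[|]", chunk_size=4))
print(records)  # ['aaaaaa', 'bbb', 'ccc', 'dd']
```

Because unread text stays in buf until a full separator shows up, the chunk boundaries do not affect the result.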