Writing a yield generator function based on a given task
This is the part of the code that should run, processing the records in batches of 1000:
for subrange, batch in batched(records, size=1000):
    print("Processing records %d-%d" %
          (subrange[0], subrange[-1]))
    process(batch)
I need to write a yield generator function for it. So far I have tried this:
def batched(records, chunk_size=1000):
    """Lazy function (generator) to read records piece by piece.
    Default chunk size: 1k."""
    while True:
        data = records.read(chunk_size)
        if not data:
            break
        yield data
The problem statement reads:
For optimal performance, records should be processed in batches.
Create a generator function "batched" that will yield batches of 1000
records at a time
I am also not quite sure how to test this function, so, any ideas?
PS: the batched generator function should be placed before the given for subrange loop.
def batched(records, chunk_size=1000):
    """Lazy function (generator) to read records piece by piece.
    Default chunk size: 1k."""
    pos = 0
    while True:
        data = records.read(chunk_size)
        if not data:
            break
        yield ([pos, pos + len(data)], data)
        pos += len(data)
The loop code you were given,
for subrange, batch in batched(records, size=1000):
    print("Processing records %d-%d" %
          (subrange[0], subrange[-1]))
    process(batch)
places implicit requirements on batched():
- It should return an iterable. A generator function does indeed provide this.
- The yielded items should be tuples of subrange, batch. The subrange could be a list of the indices of all elements, just a list or tuple of the start and end index, or perhaps a range() object. I will assume the latter.
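As a quick sanity check of that assumption (plain Python, no assumptions about the unknown records object): a range() object supports exactly the sequence indexing the given loop performs.

```python
# A range() object can be indexed like a sequence, so it satisfies
# the subrange[0] / subrange[-1] accesses in the given loop.
r = range(5, 12)
print(r[0], r[-1])  # first and last index covered: 5 11
print(len(r))       # number of records in the batch: 7
```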
Alas, we know nothing about the given records object. If it has a read() method, your approach can be adapted (note that the loop calls batched(records, size=1000), so the keyword parameter must be named size, not chunk_size):
def batched(records, size=1000):
    """Generator function to read records piece by piece.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records.read(size)
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)
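Since the real records object is unknown, here is a sketch of how this read()-based variant could be exercised, using a hypothetical RecordSource stand-in (the class name and its list-backed read() method are my assumptions, not part of the original task):

```python
class RecordSource:
    """Hypothetical stand-in for the unknown `records` object:
    exposes read(n), returning up to n records per call."""
    def __init__(self, items):
        self._items = items
        self._pos = 0

    def read(self, n):
        chunk = self._items[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk


def batched(records, size=1000):
    """Generator function to read records piece by piece.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records.read(size)
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)


records = RecordSource(list(range(2500)))
for subrange, batch in batched(records, size=1000):
    # prints: Processing records 0-999 / 1000-1999 / 2000-2499
    print("Processing records %d-%d" % (subrange[0], subrange[-1]))
```

Because a range() is half-open, subrange[-1] is the index of the last record actually in the batch, so the printed bounds are inclusive, as the loop expects.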
But if records is simply a list that should be chopped up, you could do:
def batched(records, size=1000):
    """Generator function to read records piece by piece.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records[index:index + size]
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)
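On the question of how to test it: the list-based variant needs no stand-in object, so a handful of assertions can cover the boundary cases (a ragged final batch and empty input). A minimal sketch:

```python
def batched(records, size=1000):
    """Generator function to slice a list into batches.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records[index:index + size]
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)


# Ragged final batch: 25 records in batches of 10 -> sizes 10, 10, 5.
batches = list(batched(list(range(25)), size=10))
assert [len(b) for _, b in batches] == [10, 10, 5]
assert batches[-1][0][-1] == 24  # last index covered is inclusive

# Empty input yields nothing at all.
assert list(batched([], size=10)) == []
```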