How to chunk n segments from a nested list
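I'm trying to chunk 100 lists from a nested list. I have looked through multiple examples on Stack Overflow, but I still cannot get something working correctly.
My primary list is named data_to_insert and it contains other lists. I would like to extract (chunk) 100 lists from the primary nested list.
How do I accomplish this?
This is my current code, which doesn't work as needed.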
def divide_chunks(l, n):
    for i in range(0, len(l), n):
        yield l[i:i + n]
n = 100
x = list(divide_chunks(data_to_insert, 100))
Nested list example:
data_to_insert = [['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
...
[thousands of others lists go here]]
The desired output is another list (sliced_data) that contains 100 lists from the nested list (data_to_insert).
sliced_data = [['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
...
[98 more lists go here]]
I need to loop through the nested list, data_to_insert, until it's empty.
You can use random.sample to select 100 random nested lists from your given list.
The following outputs 3 random nested lists from the original list:
import random
l = [[1,2], [3,4], [1,1], [2,3], [3,5], [0,0]]
print(random.sample(l, 3))
# output,
[[3, 4], [1, 2], [2, 3]]
If you do not want the output as a list, replace print(random.sample(l, 3)) with print(*random.sample(l, 3)):
# output,
[1, 2] [2, 3] [1, 1]
If you just want the first 100 nested lists, use a slice:
print(l[:100])
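One difference between the two approaches above is worth noting: random.sample raises ValueError when asked for more elements than the list holds, whereas a slice simply returns whatever is there. A minimal sketch, where sample_up_to is a hypothetical helper name that is not part of the original answer:

```python
import random

def sample_up_to(items, k):
    """Return up to k randomly chosen elements; never raises on short lists."""
    # Cap the sample size at the population size to avoid ValueError.
    return random.sample(items, min(k, len(items)))

nested = [[1, 2], [3, 4], [1, 1], [2, 3], [3, 5], [0, 0]]

first = nested[:100]                 # slicing past the end is safe
picked = sample_up_to(nested, 100)   # random order, at most len(nested) items

print(len(first), len(picked))
```

Both calls return all six sublists here, because the list is shorter than 100.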
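If I understood your question correctly, you first need to flatten your list of lists and then create chunks of it. Here is an example using chain.from_iterable from the itertools module, along with the code you used to create the chunks: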
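from itertools import chain

def chunks(elm, length):
    # Yield successive slices of `length` elements from elm
    for k in range(0, len(elm), length):
        yield elm[k: k + length]

# 1000 sublists of 7 items each
my_list = [['item{}'.format(j) for j in range(7)]] * 1000
# Flatten the list of lists into a single list of 7000 items
flattened = list(chain.from_iterable(my_list))
# Use a separate name so the chunks() function is not overwritten
chunked = list(chunks(flattened, 100))
print(len(chunked[10]))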
Output:
100
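For comparison, chunking the nested list directly (without flattening) keeps every sublist intact, which appears to be what the question asks for. The sketch below reuses the divide_chunks generator from the question on assumed sample data of 1000 six-item sublists:

```python
def divide_chunks(rows, n):
    """Yield successive groups of n sublists from rows."""
    for i in range(0, len(rows), n):
        yield rows[i:i + n]

# Assumed sample data: 1000 sublists of 6 items each.
data = [['item{}'.format(j) for j in range(1, 7)] for _ in range(1000)]

groups = list(divide_chunks(data, 100))

print(len(groups))      # number of chunks
print(len(groups[0]))   # number of sublists in the first chunk
print(groups[0][0])     # each element is still a complete sublist
```

With flattening, each chunk holds 100 individual items; without it, each chunk holds 100 whole sublists.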
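After some time-consuming research, I developed a solution that worked. The solution below loops through the list of lists and extracts 100 lists at a time.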
import gc

# Verifies that the list data_to_insert isn't empty
if len(data_to_insert) > 0:
    # Obtains the length of the data to insert.
    # The length is the number of sublists
    # contained in the main nested list.
    data_length = len(data_to_insert)
    # A loop counter used in the
    # data insert process.
    i = 0
    # The number of sublists to slice
    # from the main nested list in
    # each loop.
    n = 100
    # This loop executes a set of statements
    # as long as the condition below is true
    while i < data_length:
        # Increments the loop counter
        if len(data_to_insert) < n:
            i += len(data_to_insert)
        else:
            i += n
        # Slices up to 100 sublists from the main nested list.
        sliced_data = data_to_insert[:n]
        # Verifies that the list sliced_data isn't empty
        if len(sliced_data) > 0:
            # Removes those 100 sublists from the main nested list.
            data_to_insert = data_to_insert[n:]
            ##################################
            # do something with the sliced_data
            ##################################
            # Clears the list used to store the
            # sliced_data in the insertion loop.
            sliced_data.clear()
            gc.collect()
    # Clears the list used to store the
    # data elements inserted into the
    # database.
    data_to_insert.clear()
    gc.collect()
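The same batching can be written more compactly by stepping through the list with range. This is only a sketch of an equivalent loop, not the code from the answer above; process_in_batches and handle are hypothetical names, and handle stands in for whatever is done with each batch (for example, a database insert):

```python
def process_in_batches(rows, batch_size, handle):
    """Call handle() on successive slices of at most batch_size sublists."""
    for start in range(0, len(rows), batch_size):
        # The final slice may be shorter than batch_size.
        handle(rows[start:start + batch_size])

# Assumed sample data: 250 sublists.
rows = [['item1', 'item2', 'item3'] for _ in range(250)]

batch_sizes = []
process_in_batches(rows, 100, lambda batch: batch_sizes.append(len(batch)))
print(batch_sizes)
```

This version leaves the original list untouched, so no counter bookkeeping or manual clearing is needed.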
I developed a second method to accomplish my objective, based on Sufiyan Ghori's suggestion to use random.
import random

if len(my_nestled_list) > 0:
    # Obtains the length of the data to insert.
    # The length is the number of sublists
    # contained in the main nested list.
    data_length = len(my_nestled_list)
    # A loop counter used in the
    # data insert process.
    i = 0
    # The number of sublists to sample
    # from the main nested list in
    # each loop.
    n = 100
    # This loop executes a set of statements
    # as long as the condition below is true
    while i < data_length:
        # Increments the loop counter
        if len(my_nestled_list) < n:
            i += len(my_nestled_list)
        else:
            i += n
        # Uses a list comprehension to randomly select up to n lists
        # from the nested list, keeping them in their original order.
        # min() avoids a ValueError when fewer than n lists exist.
        sample_size = min(n, len(my_nestled_list))
        random_sample_of_100 = [my_nestled_list[idx] for idx in sorted(random.sample(range(len(my_nestled_list)), sample_size))]
        print(random_sample_of_100)
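Note that the loop above draws an independent sample on every pass, so the same sublist can appear in more than one batch and some sublists may never be selected. If each sublist should be visited exactly once in random order, one option is to shuffle a copy and then chunk it. A sketch under that assumption; random_batches and its seed parameter are hypothetical names introduced for illustration:

```python
import random

def random_batches(rows, batch_size, seed=None):
    """Yield batches of rows in random order, visiting each row exactly once."""
    rng = random.Random(seed)   # local generator; seed makes runs repeatable
    shuffled = list(rows)       # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]

# Assumed sample data: 250 single-item sublists.
rows = [[k] for k in range(250)]
batches = list(random_batches(rows, 100, seed=0))
print([len(b) for b in batches])
```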