How to chunk n segments from a nested list

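I am trying to chunk a nested list into groups of 100 lists. I have looked at several examples on Stack Overflow, but I still can't get it working.

My main list is named data_to_insert, and it contains other lists. I want to extract (chunk) 100 lists at a time from the main nested list.

How do I accomplish this?

This is my current code, which doesn't work as needed.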

def divide_chunks(l, n):
    for i in range(0, len(l), n):
        yield l[i:i + n]

n = 100
x = list(divide_chunks(data_to_insert, n))

Example of the nested list:

data_to_insert = [['item1','item2','item3','item4','item5','item6'],
 ['item1','item2','item3','item4','item5','item6'],
 ['item1','item2','item3','item4','item5','item6'],
 ['item1','item2','item3','item4','item5','item6'],
 ['item1','item2','item3','item4','item5','item6'],
 ...
 [thousands of other lists go here]]

The desired output is another list (sliced_data) that contains 100 lists from the nested list (data_to_insert).

sliced_data = [['item1','item2','item3','item4','item5','item6'],
 ['item1','item2','item3','item4','item5','item6'], 
 ...
 [98 more lists go here]]

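I need to loop through the nested list data_to_insert until it is empty.

You can use random.sample from the random module to select 100 random nested lists from the given list.

This outputs 3 random nested lists from the original list: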

import random

l = [[1,2], [3,4], [1,1], [2,3], [3,5], [0,0]]
print(random.sample(l, 3))


# output,
[[3, 4], [1, 2], [2, 3]]

If you don't want the output as a list, replace print(random.sample(l, 3)) with print(*random.sample(l, 3)):

# output,
[1, 2] [2, 3] [1, 1]

If you just want the first 100 nested lists, then do:

print(l[:100])
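
One caveat worth noting: random.sample raises ValueError when asked for more items than the list holds. The sketch below caps the sample size at the list length. The helper name sample_up_to is hypothetical and only used for illustration.

```python
import random

def sample_up_to(nested, k):
    """Return up to k randomly chosen sublists, without repetition.

    sample_up_to is a hypothetical helper name, not part of any library.
    """
    # random.sample raises ValueError if k exceeds len(nested),
    # so cap the sample size at the list length.
    return random.sample(nested, min(k, len(nested)))

l = [[1, 2], [3, 4], [1, 1], [2, 3], [3, 5], [0, 0]]
picked = sample_up_to(l, 100)  # only 6 sublists exist, so 6 come back
first_three = l[:3]            # plain slicing keeps the original order
```

Slicing past the end of a list is safe, so l[:100] simply returns everything when fewer than 100 sublists exist.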

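If I understand your question correctly, you need to first flatten your list of lists and then split the result into chunks. Here is an example that uses chain.from_iterable from the itertools module together with the code you used to create the chunks: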

from itertools import chain

def chunks(elm, length):
    for k in range(0, len(elm), length):
        yield elm[k: k + length]


my_list = [['item{}'.format(j) for j in range(7)]] * 1000
flattened = list(chain.from_iterable(my_list))

# Use a different name for the result so the chunks() function is not shadowed
chunked = list(chunks(flattened, 100))

print(len(chunked[10]))

Output:

100
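
As a variation on the flatten-then-chunk idea, the sketch below uses itertools.islice so the flattened data never has to be built as one big list. The name iter_chunks is hypothetical, and the sizes (10 sublists, chunks of 25) are chosen only to keep the example small.

```python
from itertools import chain, islice

def iter_chunks(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        # islice pulls at most `size` items from the shared iterator
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# 10 sublists of 7 items each, so 70 items once flattened
nested = [['item{}'.format(j) for j in range(7)] for _ in range(10)]
flat_chunks = list(iter_chunks(chain.from_iterable(nested), 25))
```

If flattening is not what you want, passing the nested list itself to iter_chunks yields groups of whole sublists instead.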

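After some time-consuming research, I developed a solution that works. The solution below loops through the list of lists and extracts 100 lists at a time.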

import gc

# Verifies that the list data_to_insert isn't empty
if len(data_to_insert) > 0:

  # Obtains the length of the data to insert.
  # The length is the number of sublists
  # contained in the main nested list.
  data_length = len(data_to_insert)

  # A loop counter used in the
  # data insert process.
  i = 0

  # The number of sublists to slice
  # from the main nested list in
  # each loop.
  n = 100

  # This loop executes a set of statements
  # as long as the condition below is true
  while i < data_length:

    # Advances the loop counter by the size of the next slice
    if len(data_to_insert) < n:
      i += len(data_to_insert)
    else:
      i += n

    # Slices up to n sublists from the main nested list.
    sliced_data = data_to_insert[:n]

    # Verifies that the list sliced_data isn't empty
    if len(sliced_data) > 0:

      # Removes the sliced sublists from the main nested list.
      data_to_insert = data_to_insert[n:]

      ##################################
      # do something with the sliced_data
      ##################################

      # Clears the list used to store the
      # sliced_data in the insertion loop.
      sliced_data.clear()
      gc.collect()

  # Clears the list used to store the
  # data elements inserted into the
  # database.
  data_to_insert.clear()
  gc.collect()
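
For comparison, here is a minimal sketch of the same batch-by-batch processing using the generator from the question, without modifying data_to_insert. The function process_batch is a hypothetical placeholder for the real work (for example, a database insert), and the sample data is made up.

```python
def divide_chunks(items, n):
    """Yield successive slices of at most n items."""
    for start in range(0, len(items), n):
        yield items[start:start + n]

def process_batch(batch):
    # Hypothetical placeholder: real code might insert the batch into a database.
    return len(batch)

# Made-up sample data: 250 sublists
data_to_insert = [['item1', 'item2'] for _ in range(250)]

# Each batch holds up to 100 sublists; the last one holds the remainder.
batch_sizes = [process_batch(batch) for batch in divide_chunks(data_to_insert, 100)]
```

Because each slice is a new list, the original data_to_insert is left untouched and no manual counter is needed.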

I developed a second method to accomplish my objective, based on Sufiyan Ghori's suggestion of using random.

import random

if len(my_nestled_list) > 0:

  # Obtains the length of the data to insert.
  # The length is the number of sublists
  # contained in the main nested list.
  data_length = len(my_nestled_list)

  # A loop counter used in the
  # data insert process.
  i = 0

  # The number of sublists to select
  # from the main nested list in
  # each loop.
  n = 100

  # This loop executes a set of statements
  # as long as the condition below is true
  while i < data_length:

    # Advances the loop counter
    if len(my_nestled_list) < n:
      i += len(my_nestled_list)
    else:
      i += n

    # random.sample raises ValueError if asked for more items
    # than exist, so cap the sample size at the list length.
    sample_size = min(n, len(my_nestled_list))

    # Uses a list comprehension to randomly select up to n lists
    # from the nested list, keeping them in their original order.
    random_sample_of_100 = [my_nestled_list[j] for j in sorted(random.sample(range(len(my_nestled_list)), sample_size))]

    print(random_sample_of_100)
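
Since the loop above never removes anything from the list, the same sublist can be picked in more than one iteration. If each sublist should land in exactly one random batch, one option is to shuffle a copy once and then chunk it. This is a sketch under that assumption; the names random_batches and my_nested_list and the seed parameter are hypothetical.

```python
import random

def random_batches(nested, n, seed=None):
    """Yield batches of up to n sublists in random order, each sublist once."""
    rng = random.Random(seed)  # seed is optional, only for reproducible runs
    shuffled = list(nested)    # shallow copy, so the caller's list keeps its order
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), n):
        yield shuffled[start:start + n]

# Made-up sample data: 250 distinct sublists
my_nested_list = [[k, k + 1] for k in range(250)]
batches = list(random_batches(my_nested_list, 100, seed=0))
```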