Using a column of values to create a counter for a variable sequential number column

Question

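I currently have a pandas DataFrame with several columns. I want to build a column, Sequential, that lists the iteration number recorded at that point in the cycle. I am currently doing this with itertools.cycle and a fixed number of iterations, block_cycles, as follows: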

# Fill out Sequential Numbers
import itertools

block_cycles = 330
lens = len(raw_data.index)
# range(1, block_cycles + 1) yields 1..block_cycles inclusive
sequential = list(itertools.islice(itertools.cycle(range(1, block_cycles + 1)), lens))
interim_output['Sequential'] = sequential

The output looks like this:

print(interim_output['Sequential'])

0    1
1    2
2    3
...
329  330
330  1
331  2
332  3

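This would be fine if every cycle had the same number of iterations. However, on investigation I found that not every cycle contains the same number of iterations. I have another column, CycleNumber, which holds the number of the cycle each iteration belongs to. It looks like this: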

print(raw_data['CycleNumber'])

0           1
1           1
2           1
3           1
4           1

51790    4936
51791    4936
51792    4936
51793    4936
51794    4936

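So, for example, one cycle may contain 330 iterations while another contains 333, 331, and so on; there is no guarantee that they are the same. The values in CycleNumber are increasing.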

I have built a dictionary, cycle_freq, of the number of iterations each cycle contains. It looks like this:


# Calculate the number of iterations each cycle contains
cycle_freq = {}
for item in cycle_number:
    if item in cycle_freq:
        cycle_freq[item] += 1
    else:
        cycle_freq[item] = 1

print(cycle_freq)

{1: 330, 2: 332, 3: 331, 4: 332, 5: 332, 6: 333, 7: 333, 8: 330....
4933: 331, 4934: 334, 4935: 287, 4936: 24}
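
As an aside, the same tally can be sketched with collections.Counter from the standard library. The cycle_number list below is a small hypothetical stand-in for the real CycleNumber values, not the actual data:

```python
from collections import Counter

# Hypothetical stand-in for the CycleNumber values: three short cycles
cycle_number = [1, 1, 1, 2, 2, 3, 3, 3, 3]

# Counter tallies how many times each cycle number occurs;
# dict() converts it to the same plain-dict shape as cycle_freq
cycle_freq = dict(Counter(cycle_number))

print(cycle_freq)  # {1: 3, 2: 2, 3: 4}
```

Keys appear in the order they are first seen, so with increasing cycle numbers the dictionary stays in cycle order.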

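How can I use this dictionary in place of the constant block_cycles, so that I build one long list of sequence numbers based on the exact number of iterations in each cycle? This is the logic I have tried so far to make it use the values in cycle_freq, without success: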

for i in cycle_freq:
    iteration = list(itertools.islice(itertools.cycle(range(1, cycle_freq[i])),lens))
    sequential.append(iteration)

The output I want looks like this:

Any help would be greatly appreciated!

Answer 1

I used a workaround and dropped itertools:

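sequential = []
# Relies on cycle_freq keeping its keys in cycle order
# (dicts preserve insertion order in Python 3.7+)
for cycles in cycle_freq.values():
    # Append 1..cycles for this cycle
    sequential.extend(range(1, cycles + 1))

interim_output['Sequential'] = sequential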
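
A possible alternative that skips the dictionary entirely is to let pandas number the rows within each cycle using groupby(...).cumcount(). This is only a minimal sketch: the small raw_data frame below is a hypothetical stand-in for the real data, and it assumes each row's CycleNumber correctly identifies its cycle.

```python
import pandas as pd

# Hypothetical stand-in for raw_data: three cycles with 3, 2 and 4 iterations
raw_data = pd.DataFrame({"CycleNumber": [1, 1, 1, 2, 2, 3, 3, 3, 3]})

# cumcount() numbers the rows inside each group starting at 0, so add 1
raw_data["Sequential"] = raw_data.groupby("CycleNumber").cumcount() + 1

print(raw_data["Sequential"].tolist())  # [1, 2, 3, 1, 2, 1, 2, 3, 4]
```

If the result needs to go into a separate frame such as interim_output, assigning the Series works as long as both frames share the same index, since pandas aligns on index during assignment.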

Tags: python, dictionary, itertools, dataframe, pandas