Python: count number of uninterrupted intervals

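Consider an array Y of 0s and 1s, for example Y = (0,1,1,0). I want to count the number of uninterrupted intervals of 0s and of 1s. In this example, n0 = 2 and n1 = 1. I have a script that does what I need, but it is not very elegant. Does anyone know a smoother or more pythonic version?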

import pandas as pd
import numpy as np

# storage
counter = {}

# number of random draws
n = 10

# dataframe of random draw between 0 and 1
Y = pd.DataFrame(np.random.choice(2, n))

# where are the 0s and 1s
idx_0 = Y[Y[0] == 0].index
idx_1 = Y[Y[0] == 1].index

# count intervals of uninterrupted 0s
j = 0
for i in idx_0:
    if i+1 < n:
        if Y.loc[i+1, 0] == 1:
            j += 1
        else:
            continue

if Y.loc[n-1, 0] == 0:
    j += 1


counter['n_0'] = j

# count intervals of uninterrupted 1s
j = 0
for i in idx_1:
    if i+1 < n:
        if Y.loc[i+1, 0] == 0:
            j += 1
        else:
            continue

if Y.loc[n-1, 0] == 1:
    j += 1

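counter['n_1'] = j

numbers = [0, 1, 1, 0]

def runs(x, numbers):
    # Join the digits into one string, e.g. [0, 1, 1, 0] -> '0110'
    number_string = ''.join(str(n) for n in numbers)
    # Splitting on the other digit leaves one non-empty piece per run of x
    return len([r for r in number_string.split('1' if x == 0 else '0') if r])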

print(runs(0, numbers))
print(runs(1, numbers))

Update using a DataFrame:

import pandas as pd
import numpy as np

# storage
counter = {}

# number of random draws
n = 10

# dataframe of random draw between 0 and 1
Y = pd.DataFrame(np.random.choice(2, n))
# Flatten the single-column DataFrame into a plain list
values = Y[0].tolist()
print(values)

def runs(x, numbers):
    # Splitting on the other digit leaves one non-empty piece per run of x
    number_string = ''.join(str(n) for n in numbers)
    return len([r for r in number_string.split('1' if x == 0 else '0') if r])

print('Runs of 0: {}'.format(runs(0, values)))
print('Runs of 1: {}'.format(runs(1, values)))
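
If the string round-trip feels awkward, the same counting can probably be done with itertools.groupby from the standard library. This is only a rough sketch: count_runs is a name made up for this illustration, and it assumes the input is any iterable of hashable values (not just 0 and 1).

```python
from collections import Counter
from itertools import groupby

def count_runs(values):
    # groupby yields one (key, group) pair per maximal run of equal
    # consecutive values, so counting the keys counts the runs.
    return Counter(key for key, _ in groupby(values))

run_counts = count_runs([0, 1, 1, 0])
print(run_counts[0])  # 2
print(run_counts[1])  # 1
```

A Counter returns 0 for a value that never occurs, so run_counts[1] still works when the input contains only 0s.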

A more concise solution using pandas methods:

counter = Y[0][Y[0].diff() != 0].value_counts()
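  • Y[0].diff() computes the difference between each element and the one before it
  • diff != 0 marks the positions where the value changes (the first element is marked too, because its diff is NaN)
  • Y[idx].value_counts() counts how often each value occurs among those positions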
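
To make the three steps easier to follow, here is a small sketch that spells out the intermediate results. It uses the fixed array (0,1,1,0) from the question instead of a random draw; the names steps and run_starts are only for illustration.

```python
import pandas as pd

# Fixed input from the question instead of a random draw
Y = pd.DataFrame([0, 1, 1, 0])

# Difference to the previous element; the first entry is NaN
steps = Y[0].diff()

# NaN != 0 evaluates to True, so the first element always starts a run
run_starts = steps != 0
print(run_starts.tolist())  # [True, True, False, True]

# Keep only the first element of each run, then count per value
counter = Y[0][run_starts].value_counts()
print(counter[0], counter[1])  # 2 1
```

This matches n0 = 2 and n1 = 1 from the question.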

Example result for the 10 random elements [0, 1, 1, 0, 1, 1, 1, 1, 1, 1]:

1    2
0    2
Name: 0, dtype: int64

If you insist on the 'n_0' and 'n_1' keys, you can rename the index with

counter = counter.rename(index={i: f'n_{i}' for i in range(2)})

You can also convert it to a dictionary with dict(counter), although the pandas object already behaves the same way: counter[key] gives you the corresponding value.
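
One thing to watch out for: if a value never occurs in Y, value_counts() has no entry for it, so counter[key] would raise a KeyError. A possible way around that, sketched below, is to reindex before renaming. The helper name run_counts_dict is hypothetical, and the sketch assumes the only values are 0 and 1.

```python
import pandas as pd

def run_counts_dict(series):
    # Keep the first element of every run (see the diff != 0 step above)
    starts = series[series.diff() != 0]
    # reindex guarantees entries for both 0 and 1, filling missing ones with 0
    counts = starts.value_counts().reindex([0, 1], fill_value=0)
    return {f'n_{value}': int(count) for value, count in counts.items()}

print(run_counts_dict(pd.Series([0, 1, 1, 0])))  # {'n_0': 2, 'n_1': 1}
print(run_counts_dict(pd.Series([0, 0, 0])))     # {'n_0': 1, 'n_1': 0}
```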