Modified cumulative sum of numbers in a list

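I want to create a new list based on the cumulative sum of the numbers in a list. The input is ideal: it can be split into subsets whose sums are all equal. The subsets do not have to be the same length. The number of subsets is given as input.

Each subset in the output is represented by an incrementing integer [0, 1, 2, 3, ...] that replaces the original input values. The number of distinct integers equals the number of subsets.

Example: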

number of subsets = 2   

input = [1, 4, 5]
#cumsum = [1, 5, 10]
subsets = [1,5], [10]
output-subsets = [0,0], [1]
output = [0, 0, 1]

Example 1:

number of subsets = 4

input = [1, 2, 3, 4, 2, 5, 1, 6]
#cumsum = [1, 3, 6, 10, 12, 17, 18, 24]
subsets = [1,3,6], [10, 12],[17, 18], [24]
output-subsets = [0, 0, 0], [1, 1], [2, 2], [3]
output = [0, 0, 0, 1, 1, 2, 2, 3]

number of subsets = 2

input = [1, 2, 3, 4, 2, 5, 1, 6]
#cumsum = [1, 3, 6, 10, 12, 17, 18, 24]
subsets = [1, 3, 6, 10, 12],[17, 18, 24]
output-subsets = [0, 0, 0, 0, 0], [1, 1, 1]
output = [0, 0, 0, 0, 0, 1, 1, 1]

I tried modifying the code from an SO question:

def changelist(lis, t):
    total = 0

    s = sum(lis)
    subset = s/t

    for x in lis:
        total += x
        i= 1
        if(total <= subset):
            i = 0
        yield i


# changelist(input list, number of subsets)
print(list(changelist([1, 2, 3, 4, 2, 5, 1, 6], 4)))

But only the first subset is correct:

output = [0, 0, 0, 1, 1, 1, 1, 1]

I think numpy.array_split is problematic here (see "strange behaviour of numpy array_split").

I would really appreciate any explanation or help.

This should solve your problem:

def changelist (l, t):
  subset = sum(l) / t
  current, total = 0, 0
  for x in l:
    total += x
    if total > subset:
      current, total = current + 1, x
    yield current

Examples:

>>> list(changelist([1, 4, 5], 2))
[0, 0, 1]
>>> list(changelist([1, 2, 3, 4, 2, 5, 1, 6], 4))
[0, 0, 0, 1, 1, 2, 2, 3]
>>> list(changelist([1, 2, 3, 4, 2, 5, 1, 6], 2))
[0, 0, 0, 0, 0, 1, 1, 1]

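How does it work?

  • current stores the "id" of the current subset, and total stores the running sum of the current subset.
  • For each element x of the initial list l, its value is added to the current total. If this total is greater than the expected sum of each subset (subset in my code), you know you have moved into the next subset (current = current + 1), and you "reset" the total of the current subset to the current element (total = x).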
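To make the two points above concrete, here is a minimal sketch that runs the same loop but records the state after each element. The helper name trace_changelist is hypothetical (it is not part of the answer's code), and it assumes the same "ideal" input as the question.

```python
def trace_changelist(values, t):
    """Same logic as changelist, but records (x, total, current) per element.

    Hypothetical helper, for illustration only.
    """
    subset = sum(values) / t      # expected sum of each subset
    current, total = 0, 0
    rows = []
    for x in values:
        total += x                # add x to the running sum of the current subset
        if total > subset:        # overshoot: x actually starts the next subset
            current, total = current + 1, x
        rows.append((x, total, current))
    return rows


for row in trace_changelist([1, 2, 3, 4, 2, 5, 1, 6], 4):
    print(row)
```

In this run the total column reads 1, 3, 6, 4, 6, 5, 6, 6: it climbs to the expected sum of 6 within each subset and is reset to x whenever it would exceed 6.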

After converting the input to an array, you can use NumPy here for a vectorized solution, with N as the number of subsets, as shown here:

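import numpy as np

def modified_cumsum(input, N):
    A = np.asarray(input).cumsum()
    # Flag where the cumsum hits a multiple of total/N, shift right by one, then count the flags.
    return np.append(False, np.isin(A, (1 + np.arange(N)) * A[-1] / N))[:-1].cumsum()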

Sample run:

In [31]: N = 2  #number of subsets
    ...: input = [1, 4, 5]
    ...: 

In [32]: modified_cumsum(input,N)
Out[32]: array([0, 0, 1])

In [33]: N = 4  #number of subsets
    ...: input = [1, 2, 3, 4, 2, 5, 1, 6]
    ...: 

In [34]: modified_cumsum(input,N)
Out[34]: array([0, 0, 0, 1, 1, 2, 2, 3])

In [35]: N = 2  #number of subsets
    ...: input = [1, 2, 3, 4, 2, 5, 1, 6]
    ...: 

In [36]: modified_cumsum(input,N)
Out[36]: array([0, 0, 0, 0, 0, 1, 1, 1])
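
For readers who find the one-liner dense, the same computation can be unpacked into named steps. This is only a sketch under the same "ideal input" assumption; the function name modified_cumsum_steps and the intermediate variable names are made up for illustration, and np.isin is used as the 1-D equivalent of np.in1d.

```python
import numpy as np


def modified_cumsum_steps(values, n):
    """Step-by-step version of the vectorized approach (illustrative names)."""
    csum = np.asarray(values).cumsum()            # running totals of the input
    targets = (1 + np.arange(n)) * csum[-1] / n   # cumsum value at which each subset ends
    ends = np.isin(csum, targets)                 # True at the last element of each subset
    starts = np.append(False, ends)[:-1]          # shift right: True at the first element of the next subset
    return starts.cumsum()                        # number of subset starts seen so far = subset id


print(modified_cumsum_steps([1, 2, 3, 4, 2, 5, 1, 6], 4))
```

Note that targets is computed with floating-point division, so the exact match in np.isin relies on each subset boundary being exactly representable. That holds for integer inputs like the ones above, but may not for arbitrary floats.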