Python TypeError: 'numpy.int32' object is not iterable

I'm trying to compute the entropy of my k-means result DataFrame, but I get the error TypeError: 'numpy.int32' object is not iterable, and I don't understand why.

from collections import Counter 
def calcEntropy(x):
    p, lens = Counter(x), np.float(len(x))
    return -np.sum(count/lens*np.log2(count/lens) for count in p.values())
k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

Then I get this error message:

<ipython-input-26-d375ecf00330> in <module>()
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

<ipython-input-26-d375ecf00330> in <listcomp>(.0)
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

<ipython-input-23-f5508ea8782c> in calcEntropy(x)
      1 from collections import Counter
      2 def calcEntropy(x):
----> 3     p, lens = Counter(x), np.float(len(x))
      4     return -np.sum(count/lens*np.log2(count/lens) for count in p.values())

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in __init__(*args, **kwds)
    535             raise TypeError('expected at most 1 arguments, got %d' % len(args))
    536         super(Counter, self).__init__()
--> 537         self.update(*args, **kwds)
    538 
    539     def __missing__(self, key):

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in update(*args, **kwds)
    622                     super(Counter, self).update(iterable) # fast path when counter is empty
    623             else:
--> 624                 _count_elements(self, iterable)
    625         if kwds:
    626             self.update(kwds)

TypeError: 'numpy.int32' object is not iterable

k_means_sp.head()

      credit    debit   cluster
0   9.207673    8.198884    1
1   4.248495    8.202181    0
2   8.149668    7.735145    2
3   5.138677    7.859741    0
4   8.058163    7.918614    2

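OK, here's a first attempt. The list comprehension iterates over k_means_sp['cluster'] one element at a time, so each x is a single numpy.int32 label, and Counter(x) fails because a scalar is not iterable. It looks like your DataFrame stores the cluster index in the 'cluster' column. So what you need to do is pull out each cluster by its index and then pass that cluster to your calcEntropy function, for example: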

for i in k_means_sp['cluster'].unique():  # loop through the cluster indices
    # .loc replaces the deprecated .ix indexer; the boolean mask keeps one cluster's rows
    cluster = k_means_sp.loc[k_means_sp['cluster'] == i, ['credit', 'debit']]
    entropy = calcEntropy(cluster)

The second line selects the rows that share the same cluster index. Does that help?
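
To make the difference between a scalar and a sequence concrete, here is a minimal, self-contained sketch. It is an illustration under stated assumptions, not necessarily what the asker ultimately needs: the DataFrame is a hypothetical stand-in built from the five rows printed by k_means_sp.head(), and calc_entropy is a lightly adapted version of the question's function that uses the built-in float, because np.float was removed in NumPy 1.24. The sketch passes a one-dimensional column to the function, since iterating over a two-column DataFrame yields its column labels rather than its rows.

```python
from collections import Counter

import numpy as np
import pandas as pd


def calc_entropy(values):
    """Shannon entropy (in bits) of the labels in a 1-D iterable."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))


# Hypothetical stand-in for k_means_sp, built from the rows shown in the question.
k_means_sp = pd.DataFrame({
    "credit": [9.207673, 4.248495, 8.149668, 5.138677, 8.058163],
    "debit": [8.198884, 8.202181, 7.735145, 7.859741, 7.918614],
    "cluster": np.array([1, 0, 2, 0, 2], dtype=np.int32),
})

# A single element of the column is a scalar label, so Counter cannot
# iterate over it. This reproduces the TypeError from the question.
try:
    Counter(k_means_sp["cluster"].iloc[0])
    scalar_error = None
except TypeError as exc:
    scalar_error = str(exc)

# Passing the whole column works: entropy of the cluster assignment.
label_entropy = calc_entropy(k_means_sp["cluster"])

# Per-cluster filtering with a boolean mask, as in the loop above.
cluster_sizes = {}
for label in k_means_sp["cluster"].unique():
    rows = k_means_sp.loc[k_means_sp["cluster"] == label, ["credit", "debit"]]
    cluster_sizes[int(label)] = len(rows)
```

Here label_entropy measures how evenly the rows are spread across the clusters. If a per-cluster entropy is what you are after, the same function can be applied to whichever single column of rows you want to summarize.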