Numpy TypeError: an integer is required

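This may be a very specific problem, but I don't know who else to ask, so I hope someone can help. Please don't skip this, thanks! I installed Python through Anaconda and work in Jupyter notebook. I have two CSV data files.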

products.head()
   ID_FUPID   FUPID
0         1  674563
1         2  674597
2         3  674606
3         4  694776
4         5  694788

products contains the product ID and the product number.

ratings.head()
   ID_CUSTOMER  ID_FUPID  RATING
0            1       216       1
1            2       390       1
2            3       851       5
3            4      5897       1
4            5      9341       1

ratings contains the customer ID, the product ID, and the rating the customer gave to the product. I created the table like this:

M = ratings.pivot_table(index=['ID_CUSTOMER'],columns=['ID_FUPID'],values='RATING')

This displays the data correctly as a matrix, with the product IDs as columns and the customer IDs as rows.

I want to compute the Pearson correlation between products, so here is the pearson function:

import numpy as np

def pearson(s1, s2):
    """Take two pd.Series objects and return their Pearson correlation."""
    # Center each series on its own mean.
    s1_c = s1 - s1.mean()
    s2_c = s2 - s2.mean()
    # Sum of cross products divided by the product of the norms.
    return np.sum(s1_c * s2_c) / np.sqrt(np.sum(s1_c ** 2) * np.sum(s2_c ** 2))
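
As a quick sanity check, the function can be compared against np.corrcoef. This is only a sketch: the two Series a and b are made-up values with no missing entries (unlike the columns of M, which may contain NaN), and the names ours and reference are just for illustration.

```python
import numpy as np
import pandas as pd


def pearson(s1, s2):
    """Take two pd.Series objects and return their Pearson correlation."""
    s1_c = s1 - s1.mean()
    s2_c = s2 - s2.mean()
    return np.sum(s1_c * s2_c) / np.sqrt(np.sum(s1_c ** 2) * np.sum(s2_c ** 2))


# Made-up data, complete (no NaN), purely for checking the formula.
a = pd.Series([1.0, 2.0, 3.0, 4.0])
b = pd.Series([2.0, 1.0, 4.0, 3.0])

ours = pearson(a, b)
reference = np.corrcoef(a, b)[0, 1]  # off-diagonal entry of the 2x2 correlation matrix

print(ours, reference)
```

On complete data like this the two values should agree, so the function itself does not seem to be the source of the error.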

When I try to compute pearson(M['17'], M['21']), I get the following error:

TypeError                                 Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2441             try:
-> 2442                 return self._engine.get_loc(key)
   2443             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: '17'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-277-d4ead225b6ab> in <module>()
----> 1 pearson(M['17'], M['21'])

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1962             return self._getitem_multilevel(key)
   1963         else:
-> 1964             return self._getitem_column(key)
   1965 
   1966     def _getitem_column(self, key):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   1969         # get column
   1970         if self.columns.is_unique:
-> 1971             return self._get_item_cache(key)
   1972 
   1973         # duplicate columns & possible reduce dimensionality

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1643         res = cache.get(item)
   1644         if res is None:
-> 1645             values = self._data.get(item)
   1646             res = self._box_item_values(item, values)
   1647             cache[item] = res

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   3588 
   3589             if not isnull(item):
-> 3590                 loc = self.items.get_loc(item)
   3591             else:
   3592                 indexer = np.arange(len(self.items))[isnull(self.items)]

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2442                 return self._engine.get_loc(key)
   2443             except KeyError:
-> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2445 
   2446         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: '17'

Any help is greatly appreciated! Thank you very much.

The error message contains the following line in two places:

KeyError: '17'

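This means that M has no key '17'. This is probably because the column labels of M are integers (they come from ID_FUPID), while you are currently accessing the DataFrame M with strings. The call to pearson should probably look like this: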

pearson(M[17], M[21])
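
To confirm this on your side, you can inspect the dtype of the column labels. The snippet below is a minimal sketch on made-up ratings shaped like yours (the values are invented, and M_str is a hypothetical name); it shows that the string key is missing while the integer key is present, and how to convert the labels if you would rather index with strings.

```python
import pandas as pd

# Made-up ratings, shaped like the ratings table in the question.
ratings = pd.DataFrame({
    "ID_CUSTOMER": [1, 1, 2, 2, 3, 3],
    "ID_FUPID":    [17, 21, 17, 21, 17, 21],
    "RATING":      [1, 2, 3, 3, 5, 4],
})

M = ratings.pivot_table(index=["ID_CUSTOMER"], columns=["ID_FUPID"], values="RATING")

# The column labels come from ID_FUPID, so they are integers, not strings.
print(M.columns.dtype)

has_str_key = "17" in M.columns  # string label: not found
has_int_key = 17 in M.columns    # integer label: found
print(has_str_key, has_int_key)

# Option 1: index with integers.
col_17 = M[17]

# Option 2: convert the labels to strings, then index with strings.
M_str = M.copy()
M_str.columns = M_str.columns.astype(str)
col_17_str = M_str["17"]
```

Either option works; indexing with integers avoids the copy and keeps the labels consistent with the ID_FUPID column.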