Numpy TypeError: an integer is required
Numpy TypeError: an integer is required
这可能是一个非常私人的问题,但我不知道该问谁我希望有人能提供帮助,不要跳过我,谢谢!。我已经使用 Anaconda 和 Jupyter notebook 安装了 python。我有 2 个 csv 数据文件。
products.head()
ID_FUPID FUPID
0 1 674563
1 2 674597
2 3 674606
3 4 694776
4 5 694788
产品包含产品ID和产品编号。
ratings.head()
ID_CUSTOMER ID_FUPID RATING
0 1 216 1
1 2 390 1
2 3 851 5
3 4 5897 1
4 5 9341 1
Ratings containt id of customer, productID 和客户给产品的评级。
我将 table 创建为:
M = ratings.pivot_table(index=['ID_CUSTOMER'],columns=['ID_FUPID'],values='RATING')
在矩阵中正确显示数据,其中 productID= 列,customerID 为行。
我想计算产品之间的 pearson 比对,所以这里是 pearson 函数:
def pearson(s1, s2):
import numpy as np
"""take two pd.series objects and return a pearson correlation"""
s1_c = s1 - s1.mean()
s2_c = s2 - s2.mean()
return np.sum(s1_c * s2_c) / np.sqrt(np.sum(s1_c ** 2) * np.sum(s2_c ** 2))
当我尝试计算 pearson(M['17'], M['21']) 时,我遇到了以下错误:
TypeError Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
TypeError: an integer is required
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2441 try:
-> 2442 return self._engine.get_loc(key)
2443 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
KeyError: '17'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
TypeError: an integer is required
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-277-d4ead225b6ab> in <module>()
----> 1 pearson(M['17'], M['21'])
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
1962 return self._getitem_multilevel(key)
1963 else:
-> 1964 return self._getitem_column(key)
1965
1966 def _getitem_column(self, key):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
1969 # get column
1970 if self.columns.is_unique:
-> 1971 return self._get_item_cache(key)
1972
1973 # duplicate columns & possible reduce dimensionality
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
1643 res = cache.get(item)
1644 if res is None:
-> 1645 values = self._data.get(item)
1646 res = self._box_item_values(item, values)
1647 cache[item] = res
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
3588
3589 if not isnull(item):
-> 3590 loc = self.items.get_loc(item)
3591 else:
3592 indexer = np.arange(len(self.items))[isnull(self.items)]
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2442 return self._engine.get_loc(key)
2443 except KeyError:
-> 2444 return self._engine.get_loc(self._maybe_cast_indexer(key))
2445
2446 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
KeyError: '17'
非常感谢任何帮助!太感谢了。
错误消息中有两处包含以下行:
KeyError: '17'
这表示 M
中没有键 '17'
。这可能是因为您的索引是整数。但是,您当前正在使用字符串访问 DataFrame M
。调用 pearson
的代码可能如下所示:
pearson(M[17], M[21])
这可能是一个非常私人的问题,但我不知道该问谁我希望有人能提供帮助,不要跳过我,谢谢!。我已经使用 Anaconda 和 Jupyter notebook 安装了 python。我有 2 个 csv 数据文件。
products.head()
ID_FUPID FUPID
0 1 674563
1 2 674597
2 3 674606
3 4 694776
4 5 694788
产品包含产品ID和产品编号。
ratings.head()
ID_CUSTOMER ID_FUPID RATING
0 1 216 1
1 2 390 1
2 3 851 5
3 4 5897 1
4 5 9341 1
Ratings containt id of customer, productID 和客户给产品的评级。 我将 table 创建为:
M = ratings.pivot_table(index=['ID_CUSTOMER'],columns=['ID_FUPID'],values='RATING')
在矩阵中正确显示数据,其中 productID= 列,customerID 为行。
我想计算产品之间的 pearson 比对,所以这里是 pearson 函数:
def pearson(s1, s2):
import numpy as np
"""take two pd.series objects and return a pearson correlation"""
s1_c = s1 - s1.mean()
s2_c = s2 - s2.mean()
return np.sum(s1_c * s2_c) / np.sqrt(np.sum(s1_c ** 2) * np.sum(s2_c ** 2))
当我尝试计算 pearson(M['17'], M['21']) 时,我遇到了以下错误:
TypeError Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
TypeError: an integer is required
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2441 try:
-> 2442 return self._engine.get_loc(key)
2443 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
KeyError: '17'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
TypeError: an integer is required
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-277-d4ead225b6ab> in <module>()
----> 1 pearson(M['17'], M['21'])
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
1962 return self._getitem_multilevel(key)
1963 else:
-> 1964 return self._getitem_column(key)
1965
1966 def _getitem_column(self, key):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
1969 # get column
1970 if self.columns.is_unique:
-> 1971 return self._get_item_cache(key)
1972
1973 # duplicate columns & possible reduce dimensionality
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
1643 res = cache.get(item)
1644 if res is None:
-> 1645 values = self._data.get(item)
1646 res = self._box_item_values(item, values)
1647 cache[item] = res
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
3588
3589 if not isnull(item):
-> 3590 loc = self.items.get_loc(item)
3591 else:
3592 indexer = np.arange(len(self.items))[isnull(self.items)]
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2442 return self._engine.get_loc(key)
2443 except KeyError:
-> 2444 return self._engine.get_loc(self._maybe_cast_indexer(key))
2445
2446 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
KeyError: '17'
非常感谢任何帮助!太感谢了。
错误消息中有两处包含以下行:
KeyError: '17'
这表示 M
中没有键 '17'
。这可能是因为您的索引是整数。但是,您当前正在使用字符串访问 DataFrame M
。调用 pearson
的代码可能如下所示:
pearson(M[17], M[21])