KeyError occurring while applying lambda function to pandas DataFrame
I am applying K-means clustering to a pandas DataFrame. The cluster assignment function is as follows:
def assign_to_cluster(row):
    lowest_distance = -1
    closest_cluster = -1
    for cluster_id, centroid in centroids_dict.items():
        df_row = [row['PPG'],row['ATR']]
        euclidean_distance = calculate_distance(centroids, df_row)
        if lowest_distance == -1:
            lowest_distance = euclidean_distance
            closest_cluster = cluster_id
        elif euclidean_distance < lowest_distance:
            lowest_distance = euclidean_distance
            closest_cluster = cluster_id
    return closest_cluster

point_guards['CLUSTER'] = point_guards.apply(lambda row: assign_to_cluster(row), axis=1)
But I get the following error when using the lambda function:
1945 return self._engine.get_loc(key)
1946 except KeyError:
-> 1947 return self._engine.get_loc(self._maybe_cast_indexer(key))
1948
1949 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()
pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()
pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()
KeyError: (0, 'occurred at index 0')
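Can anyone explain the cause of the error and how I can resolve it? If you need more information, please reply to this post.
Apologies for the formatting. This is my first time asking a question on Stack Overflow.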
It turns out I made a simple mistake. When calling the function 'calculate_distance', I should have used the 'centroid' part of the dictionary 'centroids_dict.items()':
for cluster_id, centroid in centroids_dict.items():
    df_row = [row['PPG'],row['ATR']]
    euclidean_distance = calculate_distance(centroid, df_row)
    ....
Instead, I used 'centroids':
for cluster_id, centroid in centroids_dict.items():
    df_row = [row['PPG'],row['ATR']]
    euclidean_distance = calculate_distance(centroids, df_row)
It is solved now, though.
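For anyone who hits the same thing, here is a minimal runnable sketch of the corrected assignment step. The question does not show the data or 'calculate_distance', so the sample 'point_guards' values, the shape of 'centroids_dict' (cluster id mapped to a [PPG, ATR] list), and the body of 'calculate_distance' are all assumptions made for illustration. A likely cause of the KeyError, though I cannot confirm it from the traceback alone, is that 'centroids' was a DataFrame, so indexing it with the integer 0 inside 'calculate_distance' was treated as a lookup for a column labelled 0.

```python
import math

import pandas as pd

# Hypothetical sample data; the real point_guards DataFrame is not shown.
point_guards = pd.DataFrame({
    "PPG": [10.0, 12.0, 25.0, 27.0],
    "ATR": [2.0, 2.2, 3.5, 3.6],
})

# Assumed shape: cluster id -> [PPG, ATR] coordinates of that centroid.
centroids_dict = {0: [11.0, 2.1], 1: [26.0, 3.5]}


def calculate_distance(centroid, player_values):
    """Euclidean distance between two equal-length coordinate lists.

    Assumed implementation; the original helper is not shown in the question.
    """
    return math.sqrt(sum((c - p) ** 2 for c, p in zip(centroid, player_values)))


def assign_to_cluster(row):
    df_row = [row["PPG"], row["ATR"]]
    # Use the loop variable `centroid` (a single centroid),
    # not a global `centroids` object.
    distances = {
        cluster_id: calculate_distance(centroid, df_row)
        for cluster_id, centroid in centroids_dict.items()
    }
    # Return the id of the cluster with the smallest distance.
    return min(distances, key=distances.get)


# The function can be passed directly; wrapping it in a lambda is not needed.
point_guards["CLUSTER"] = point_guards.apply(assign_to_cluster, axis=1)
```

Taking the minimum over a dictionary of distances also removes the need for the -1 sentinel values, but the original if/elif loop works just as well once 'centroid' is passed in.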