KeyError: 'labels [data] not contained in axis'
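I have tried several different ways to add a row to an existing pandas DataFrame. For example, I tried the solution here, but I could not fix the problem. I have reverted to my original code in the hope that someone can help me.
Here is my code: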
print('XDF Created, Starting Bucket Separation...')
XDFDFdrop = pd.DataFrame.duplicated(XDFDF,subset='LastSurveyMachineID')
index_of_unique = XDFDF.drop_duplicates(subset='LastSurveyMachineID')
for index,row in zip(XDFDFdrop,XDFDF.itertuples()):
    if index:
        goodBucket.append(row)
    else:
        badBucket.append(row)
goodBucketDF = pd.DataFrame(goodBucket)
badBucketDF = pd.DataFrame(badBucket)
print('Bucket Separation Complete, EmailPrefix to F+L Test Starting...')
for emp , fname , lname , row1 in zip(goodBucketDF['EmailPrefix'] , goodBucketDF['Fname'] , goodBucketDF['Lname'] , goodBucketDF.itertuples()):
    for emp2 , row2 in zip(goodBucketDF['EmailPrefix'] , goodBucketDF.itertuples()):
        if columns != rows:
            temp = fuzz.token_sort_ratio((fname+lname),emp)
            temp2 = fuzz.token_sort_ratio((fname+lname),emp2)
            if abs(temp - temp2) < 10:
                badBucketDF.append(list(row2))
                goodBucketDF = goodBucketDF.drop(row2)
                removed = True
        rows += 1
    if removed:
        badBucketDF.append(list(row2))
        goodBucketDF = goodBucketDF.drop(row2)
        removed = False
    columns += 1
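Please note: XDFDF is a relatively large dataset that was built with pandas and pulled from a database (this should not affect the code you see; I just thought I would disclose it).
Here is my error: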
Traceback (most recent call last):
  File "/Users/john/PycharmProjects/Greatness/venv/Recipes.py", line 122, in <module>
    goodBucketDF = goodBucketDF.drop([rows])
  File "/Users/john/PycharmProjects/Greatness/venv/lib/python3.6/site-packages/pandas/core/frame.py", line 3694, in drop
    errors=errors)
  File "/Users/john/PycharmProjects/Greatness/venv/lib/python3.6/site-packages/pandas/core/generic.py", line 3108, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/Users/john/PycharmProjects/Greatness/venv/lib/python3.6/site-packages/pandas/core/generic.py", line 3140, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/Users/john/PycharmProjects/Greatness/venv/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4387, in drop
    'labels %s not contained in axis' % labels[mask])
KeyError: 'labels [(15, '1397659289', 'joshi.penguin@gmail.com', 'jim', 'smith', '1994-05-04', 'joshi.penguin', 'CF032611-8A86-4688-9715-E1278E75D046')] not contained in axis'
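Process finished with exit code 1
I would like to know whether anyone has a solution to this error, so that I can take a row from one DataFrame and put it in another (order does not matter, and I do not care whether the index is duplicated). Once the row is in the new DataFrame, I want to remove it from the old one.
My current problem is removing the row from the old DataFrame. Any help would be appreciated.
If you have any questions about the code, let me know and I will reply as soon as I can. Thank you for your help.
Edit 1
Below I have included the printout of row 1. Hopefully this helps as well.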
Pandas(Index=1, _1=2, entity_id='1180722688', email='assassin_penguin@live.com', Fname='jim', Lname='smith', Birthdate='1990-09-14', EmailPrefix='assassin_penguin', LastSurveyMachineID=None)
Given that XDFDF is a pandas.DataFrame, shouldn't the following work?
XDFDFdrop = pd.DataFrame.duplicated(XDFDF,subset='LastSurveyMachineID')
goodBucket = XDFDF.loc[~XDFDFdrop] #the ~ negates a boolean array
badBucket = XDFDF.loc[XDFDFdrop]
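To see what the mask does, here is a minimal self-contained sketch. The toy frame below is made up purely for illustration (only the column name LastSurveyMachineID comes from the question), and it assumes the default keep='first' behaviour of duplicated, where the first occurrence of each value is not flagged.

```python
import pandas as pd

# Hypothetical toy data standing in for XDFDF (values are invented).
XDFDF = pd.DataFrame({
    "EmailPrefix": ["a", "b", "c", "d"],
    "LastSurveyMachineID": ["M1", "M2", "M1", "M3"],
})

# Boolean Series: True for each row whose LastSurveyMachineID
# already appeared in an earlier row (default keep='first').
XDFDFdrop = XDFDF.duplicated(subset="LastSurveyMachineID")

goodBucket = XDFDF.loc[~XDFDFdrop]  # rows not flagged as duplicates
badBucket = XDFDF.loc[XDFDFdrop]    # rows flagged as duplicates

print(goodBucket)
print(badBucket)
```

Both results are ordinary DataFrames that keep the original index labels, so no row-by-row loop or list of tuples is needed for the split.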
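Edit:
The updated error comes from passing the entire row, rather than its index label, to pandas.DataFrame.drop. drop looks its argument up in the index, so when iterating with itertuples() pass row2.Index instead of row2.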
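As a rough sketch of moving rows from one frame to another by index label: the frames, the values, and the "penguin" condition below are all hypothetical stand-ins (the real code uses a fuzzy-match test). It collects labels during the loop and moves the rows afterwards, so the frame is not modified while it is being iterated. It also uses pd.concat, since DataFrame.append returned a new frame rather than changing the original in place and was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical stand-ins for goodBucketDF and badBucketDF.
good = pd.DataFrame(
    {"EmailPrefix": ["jsmith", "assassin_penguin", "joshi.penguin"],
     "Fname": ["jim", "jim", "jim"]},
    index=[1, 15, 20],
)
bad = pd.DataFrame({"EmailPrefix": ["dup"], "Fname": ["ann"]}, index=[0])

labels_to_move = []
for row in good.itertuples():
    # row is a namedtuple; row.Index holds the index label,
    # which is what DataFrame.drop expects.
    if "penguin" in row.EmailPrefix:  # placeholder for the real test
        labels_to_move.append(row.Index)

# Copy the selected rows into the other frame, then drop them by label.
bad = pd.concat([bad, good.loc[labels_to_move]])
good = good.drop(labels_to_move)

print(good)
print(bad)
```

Passing the namedtuple itself to drop reproduces the KeyError from the question, because the whole tuple is then treated as a label that does not exist in the index.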