KeyError: ('1', 'occurred at index 0')
I'm really new to Python, and I'm working with the following dataframes:
import pandas as pd

data1 = {'Store_ID': ['1','1','1','1','2','2','2','3','3'],
         'YearMonth': [201801,201802,201805,201904,201812,201902,201906,201904,201907],
         'AVG_Rating': [5.0,4.5,4.0,3.5,3.0,4.5,4.0,2.5,4.0]}
df1 = pd.DataFrame(data1)
                  AVG_Rating
Store_ID  AnoMes
1         201801         5.0
          201802         4.5
          201805         4.0
          201904         3.5
2         201812         3.0
          201902         4.5
          201906         4.0
3         201904         2.5
          201907         4.0
data2 = {'Client_ID': ['1212','1234','1122','1230'],
         'Store_ID': ['1','1','2','3'],
         'YearMonth': [201804,201906,201904,201906]}
df2 = pd.DataFrame(data2)
          Client_ID  YearMonth
Store_ID
1              1212     201804
1              1234     201906
2              1122     201904
3              1230     201906
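I set the Store_ID column as the index of both dataframes.
I need to merge in the most recent AVG_Rating from df1 based on the YearMonth column, which is the year and month in which the client made a purchase at the store. My final dataframe should be: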
          Client_ID  YearMonth  AVG_Rating
Store_ID
1              1212     201804         4.5  (the 201802 rating)
To do that, I tried to use the function below with apply, but I get an error:
def get_previous_loja_rating(row):
    loja = df1[row['Loja_ID']]
    lst = loja[loja['AnoMes']] < df2[row['AnoMes']]
    return lst[-1]

df2['PREVIOUS_RATING_MEAN'] = df1['AnoMes'].apply(get_previous_loja_rating,axis=1)
KeyError: ('Loja_ID', 'occurred at index 1')
Can anyone help me?
It looks like your code uses Portuguese key names (Loja_ID, AnoMes, etc.) while your data uses English ones. You will want to change them to Store_ID and YearMonth.
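I will use YearMonth as the column name instead of AnoMes. Your function fails for several reasons.
As I understand it, you want to add an average rating column that holds, for each client, the rating from the most recent YearMonth of the corresponding store.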
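One of those reasons can be reproduced in isolation. This is only a sketch on a cut-down copy of df1 (the variable names store_id, raised and store_rows are made up for illustration): indexing a dataframe with a plain string looks for a column with that name, so df1['1'] raises a KeyError, whereas a boolean mask selects the rows of that store.

```python
import pandas as pd

# Cut-down copy of df1 from the question (first rows only, for illustration)
df1 = pd.DataFrame({'Store_ID': ['1', '1', '2'],
                    'YearMonth': [201801, 201802, 201812],
                    'AVG_Rating': [5.0, 4.5, 3.0]})

store_id = '1'

# df1['1'] is a column lookup; there is no column named '1', so it fails
try:
    df1[store_id]
    raised = False
except KeyError:
    raised = True

# A boolean mask compares the Store_ID column row by row and filters on it
store_rows = df1[df1['Store_ID'] == store_id]
print(raised)
print(store_rows)
```

The same distinction applies to row['Loja_ID']: the row only has the keys that exist as column names in the dataframe being iterated.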
df1
Store_ID YearMonth AVG_Rating
0 1 201801 5.0
1 1 201802 4.5
2 1 201805 4.0
3 1 201904 3.5
4 2 201812 3.0
df2
Client_ID Store_ID YearMonth
0 1212 1 201804
1 1234 1 201906
2 1122 2 201904
3 1230 3 201906
def get_previous_loja_rating(row):
    loja = df1[df1['Store_ID']==row['Store_ID']]
    # list of all yearmonth values less than or equal to client's yearmonth
    lst = [i for i in loja['YearMonth'] if i <= row['YearMonth']]
    # avg rating of the most recent yearmonth
    return df1[(df1['YearMonth']==max(lst))&(df1['Store_ID']==row['Store_ID'])]['AVG_Rating'].iloc[0]

df2['AVG_Rating'] = df2.apply(get_previous_loja_rating,axis=1)
df2
Client_ID Store_ID YearMonth AVG_Rating
0 1212 1 201804 4.5
1 1234 1 201906 3.5
2 1122 2 201904 4.5
3 1230 3 201906 2.5
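This puts the average rating from the most recent YearMonth at or before each client's YearMonth into your client dataframe.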
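As an alternative to the row-wise apply, the same lookup can probably be expressed with pandas.merge_asof, which matches each left row to the last right row whose key is less than or equal to it. This is a sketch rather than the method used above; it assumes Store_ID is an ordinary column (not the index) in both dataframes, and the names left, right and result are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Store_ID': ['1', '1', '1', '1', '2', '2', '2', '3', '3'],
    'YearMonth': [201801, 201802, 201805, 201904, 201812,
                  201902, 201906, 201904, 201907],
    'AVG_Rating': [5.0, 4.5, 4.0, 3.5, 3.0, 4.5, 4.0, 2.5, 4.0],
})
df2 = pd.DataFrame({
    'Client_ID': ['1212', '1234', '1122', '1230'],
    'Store_ID': ['1', '1', '2', '3'],
    'YearMonth': [201804, 201906, 201904, 201906],
})

# merge_asof requires both frames to be sorted by the "on" key
left = df2.sort_values('YearMonth')
right = df1.sort_values('YearMonth')

# For each client row, take the latest rating of the same store whose
# YearMonth is <= the client's YearMonth (direction='backward')
result = pd.merge_asof(left, right, on='YearMonth', by='Store_ID',
                       direction='backward')
print(result)
```

Note that the result is ordered by YearMonth rather than by the original row order of df2. A client with no earlier rating for its store gets NaN here, whereas the apply version would fail on max() of an empty list.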