Conditionally insert rows into pandas DataFrame
Here is my problem: I pull a CSV into a pandas DataFrame that looks like this:
Identity Date value1 value2 Random
Apple 1/1/2005 10 10 Orange
Apple 12/1/2005 1 1 Orange
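I then need to select Identity Apple, find its minimum and maximum dates, and insert a row for each month in between so that the gap between the two points is filled in. The end result should become: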
Identity Date value1 value2 Random
Apple 1/1/2005 10 10 Orange
Apple 2/1/2005 0 0 Orange
Apple 3/1/2005 0 0 Orange
. . . . .
. . . . .
. . . . .
Apple 12/1/2005 1 1 Orange
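The problem is that although I can loop through a list of identities and pull out all the matching rows, I can't seem to find a way to insert the additional rows, especially without a nasty for loop. Essentially I need to bridge the date gaps and fill the associated values for each identity with zeros.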
identities = ['Apple', 'Banana', 'Orange']
for identity in identities:
    # Select the rows that belong to this identity.
    subset = data.loc[data['Identity'] == identity]
Edit:
Working code below:
import pandas as pd

df = pd.DataFrame(
    [['Apple', pd.to_datetime('1/1/2005'), 10, 10, 'Orange'],
     ['Orange', pd.to_datetime('8/1/2005'), 1, 1, 'Apple'],
     ['Apple', pd.to_datetime('12/1/2005'), 1, 1, 'Orange']],
    columns=['Identity', 'Date', 'value1', 'value2', 'Random'])

dummydata = []
identity = ['Apple', 'Orange']
random = ['Orange', 'Apple']
years = ['2005', '2005']
for i in range(len(identity)):
    # Build one row per month of the year for this identity.
    full_df = pd.DataFrame()
    full_df['Date'] = [pd.to_datetime(str(x) + '/1/' + years[i]) for x in range(1, 13)]
    full_df['Identity'] = identity[i]
    full_df['Random'] = random[i]
    dummydata.append(full_df)

# Stack the per-identity frames, then left-merge the real data and zero-fill the gaps.
full_df = pd.concat(dummydata)
result = full_df.merge(df, how='left').fillna(0)
print(result)
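The per-identity loop above can probably be avoided by building the whole (Identity, month) grid in one step. The following is only a sketch under assumptions that are not in the original post: every identity should cover the full year 2005, and Random is constant for each Identity, so it can be looked up from the existing rows. The names months, grid and random_by_identity are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Apple", pd.Timestamp("2005-01-01"), 10, 10, "Orange"],
        ["Orange", pd.Timestamp("2005-08-01"), 1, 1, "Apple"],
        ["Apple", pd.Timestamp("2005-12-01"), 1, 1, "Orange"],
    ],
    columns=["Identity", "Date", "value1", "value2", "Random"],
)

# One row per (Identity, month start) for the whole of 2005 (assumed range).
months = pd.date_range("2005-01-01", "2005-12-01", freq="MS")
grid = pd.MultiIndex.from_product(
    [df["Identity"].unique(), months], names=["Identity", "Date"]
).to_frame(index=False)

# Assumption: Random is constant per Identity, so take it from the first row seen.
random_by_identity = df.drop_duplicates("Identity").set_index("Identity")["Random"]
grid["Random"] = grid["Identity"].map(random_by_identity)

# Left-merge the real data onto the grid and zero-fill only the value columns.
result = grid.merge(df, on=["Identity", "Date", "Random"], how="left")
result[["value1", "value2"]] = result[["value1", "value2"]].fillna(0).astype(int)
print(result)
```

Filling only value1 and value2 keeps fillna(0) from touching the text columns.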
My suggestion is to create the full theoretical DataFrame, merge it with the data, and fillna:
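import pandas as pd

df = pd.DataFrame(
    [['Apple', pd.to_datetime('1/1/2005'), 10, 10, 'Orange'],
     ['Apple', pd.to_datetime('12/1/2005'), 1, 1, 'Orange']],
    columns=['Identity', 'Date', 'value1', 'value2', 'Random'])

# Theoretical frame: one row per month of 2005 for Apple.
full_df = pd.DataFrame()
full_df['Date'] = [pd.to_datetime(str(x) + '/1/2005') for x in range(1, 13)]
full_df['Identity'] = 'Apple'
# Without this column, fillna(0) would also put 0 into Random on the inserted rows.
full_df['Random'] = 'Orange'
# Left-merge keeps every month; months missing from df get NaN, then 0.
result = full_df.merge(df, how='left').fillna(0)
result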
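This works for one identity and one year. To cover more, loop over the years and identities, append each DataFrame you create to a list, and then call pd.concat(list).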
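If the range should run from each identity's own minimum date to its maximum date, as the question describes, one possible variation is to reindex each group onto a pd.date_range and concatenate the pieces. This is a sketch rather than the answer's method. It assumes every date falls on the first of a month, that dates are unique within an identity, and that Random is constant per Identity. The helper fill_months and the Orange sample rows are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Identity": ["Apple", "Apple", "Orange", "Orange"],
        "Date": pd.to_datetime(
            ["2005-01-01", "2005-12-01", "2005-03-01", "2005-06-01"]
        ),
        "value1": [10, 1, 5, 7],
        "value2": [10, 1, 5, 7],
        "Random": ["Orange", "Orange", "Apple", "Apple"],
    }
)


def fill_months(group: pd.DataFrame) -> pd.DataFrame:
    """Reindex one identity's rows onto every month start between its min and max Date."""
    months = pd.date_range(
        group["Date"].min(), group["Date"].max(), freq="MS", name="Date"
    )
    out = group.set_index("Date").reindex(months)
    # Inserted months have no data, so their values become 0.
    out[["value1", "value2"]] = out[["value1", "value2"]].fillna(0).astype(int)
    # Assumption: Random is constant per Identity, so carry it forward.
    out["Random"] = out["Random"].ffill()
    return out.reset_index()


pieces = []
for identity, group in df.groupby("Identity", sort=False):
    filled = fill_months(group.drop(columns="Identity"))
    filled.insert(0, "Identity", identity)
    pieces.append(filled)

result = pd.concat(pieces, ignore_index=True)
print(result)
```

There is still a loop over identities, but only one iteration per identity rather than one per inserted row, and the month range is derived from the data instead of being hard-coded.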