Mapping data to the same weekday in another year in Python pandas
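I have a pandas DataFrame holding a full year of electricity usage data, but I want to update the table to another year. I want the data values to fall on the same weekday as before.
I have: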
Date        00:00  ...  WeekDay    requiredDate  requiredWeekDay
25/11/2018  20          Sunday     25/11/2018    Sunday
26/11/2018  30          Monday     26/11/2018    Monday
27/11/2018  25          Tuesday    27/11/2018    Tuesday
28/11/2018  35          Wednesday  28/11/2018    Wednesday
29/11/2018  40          Thursday   29/11/2018    Thursday
30/11/2018  15          Friday     30/11/2018    Friday
01/12/2017  65          Sunday     01/12/2018    Saturday
02/12/2017  34          Monday     02/12/2018    Sunday
03/12/2017  81          Tuesday    03/12/2018    Monday
04/12/2017  62          Wednesday  04/12/2018    Tuesday
...
What I want:
Date        00:00  ...  WeekDay
25/11/2018  20          Sunday
26/11/2018  30          Monday
27/11/2018  25          Tuesday
28/11/2018  35          Wednesday
29/11/2018  40          Thursday
30/11/2018  15          Friday
01/12/2018              Saturday
02/12/2018  65          Sunday
03/12/2018  34          Monday
04/12/2018  81          Tuesday
...
What I have tried:
df['Day'] = df['Date'].dt.day
df['Month'] = df['Date'].dt.month
df['Year'] = df['Date'].dt.year
requiredYear = str(df['Year'].median()).replace(".0","")
df = df.sort_values(by = ['Month', 'Day']).reset_index()
df['RemappedDate']= np.nan
for index, row in df.iterrows():
    if row['Weekday'] != row['requiredWeekday']:
        while row[row['Day']]<31:
            row['Day'] = row['Day']-1
        row['RemappedDate'] = pd.to_datetime(str(row['Month'])+"/"+
                                             str(row['Day'])+"/"+requiredYear)
    else:
        print("Already equal")
df['Date'] = df['RemappedDate']
df['Weekday'] = df['requiredWeekday']
I am probably not far off; apologies if I am. I'm a beginner.
IIUC, you can increment the year without the help of the required-date columns:
print(df)
        Date  something    WeekDay
0 2018-11-25         20     Sunday
1 2018-11-26         30     Monday
2 2018-11-27         25    Tuesday
3 2018-11-28         35  Wednesday
4 2018-11-29         40   Thursday
5 2018-11-30         15     Friday
6 2017-12-01         65     Sunday
7 2017-12-02         34     Monday
8 2017-12-03         81    Tuesday
9 2017-12-04         62  Wednesday
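# unit='y' meant an average year (365 days 05:49:12); newer pandas releases no longer accept it, so it is spelled out
one_year = pd.Timedelta(days=365, hours=5, minutes=49, seconds=12)
df['new_Date'] = df['Date'].mask(df['Date'].dt.year == 2017, df['Date'] + one_year + pd.to_timedelta(12, unit='h'))
df['required_date'] = df.new_Date.dt.date
df['new_day'] = df.new_Date.dt.day_name()
changed = df.WeekDay != df.new_day  # rows whose weekday differs after the move
lookup = dict(zip(df.loc[changed, 'WeekDay'], df.loc[changed, 'something']))  # old weekday -> value
df['new_value'] = np.where(~changed, df.something, df.new_day.map(lookup))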
print(df)
        Date  something    WeekDay             new_Date  required_date \
0 2018-11-25         20     Sunday  2018-11-25 00:00:00     2018-11-25
1 2018-11-26         30     Monday  2018-11-26 00:00:00     2018-11-26
2 2018-11-27         25    Tuesday  2018-11-27 00:00:00     2018-11-27
3 2018-11-28         35  Wednesday  2018-11-28 00:00:00     2018-11-28
4 2018-11-29         40   Thursday  2018-11-29 00:00:00     2018-11-29
5 2018-11-30         15     Friday  2018-11-30 00:00:00     2018-11-30
6 2017-12-01         65     Sunday  2018-12-01 17:49:12     2018-12-01
7 2017-12-02         34     Monday  2018-12-02 17:49:12     2018-12-02
8 2017-12-03         81    Tuesday  2018-12-03 17:49:12     2018-12-03
9 2017-12-04         62  Wednesday  2018-12-04 17:49:12     2018-12-04
    new_day  new_value
0    Sunday       20.0
1    Monday       30.0
2   Tuesday       25.0
3 Wednesday       35.0
4  Thursday       40.0
5    Friday       15.0
6  Saturday        NaN
7    Sunday       65.0
8    Monday       34.0
9   Tuesday       81.0
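The 17:49:12 time component in new_Date comes from adding an average-year duration plus 12 hours. A minimal sketch of a calendar-based alternative, assuming pd.DateOffset(years=1) is acceptable in place of the timedelta arithmetic (the three sample dates and the variable names here are illustrative, not part of the answer above):

```python
import pandas as pd

# Illustrative dates: two rows from the old year, one already in the target year
dates = pd.Series(pd.to_datetime(["2017-12-01", "2017-12-02", "2018-11-30"]))

# Move only the 2017 rows forward by one calendar year.
# DateOffset works on calendar fields, so the time of day stays at midnight.
is_2017 = dates.dt.year == 2017
moved = dates.mask(is_2017, dates + pd.DateOffset(years=1))

# Weekday names of the moved dates
new_day = moved.dt.day_name()

print(pd.DataFrame({"moved": moved, "new_day": new_day}))
```

With this variant the moved dates need no extra rounding step before taking .dt.date or .dt.day_name().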
If I were you, I would just keep the 2 columns that were already made for you and shift the "something" column, e.g.:
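mask = df['Date'] <= '2018-01-01'
# Use float so the NaN left by shift() fits, and assign through .loc instead of chained indexing
df['something'] = df['something'].astype(float)
df.loc[mask, 'something'] = df.loc[mask, 'something'].shift(1)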
You can keep the 2 columns "new_date" and "new_day". Drop the others and rename these two to whatever you want. :)
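As a quick check of the shift idea, here is a minimal sketch on a few of the sample rows. The small frame is illustrative, and converting the column to float first is an assumption made so that the NaN left by shift() can be stored:

```python
import pandas as pd

# Illustrative subset of the sample rows from the question
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2018-11-29", "2018-11-30", "2017-12-01", "2017-12-02", "2017-12-03"]
    ),
    "something": [40, 15, 65, 34, 81],
})

# Rows that are still dated in the old year
mask = df["Date"] <= "2018-01-01"

# Float dtype so the NaN created by shift() can be stored in the column
df["something"] = df["something"].astype(float)

# Push each old-year value down by one row; the first old-year row becomes NaN
df.loc[mask, "something"] = df.loc[mask, "something"].shift(1)

print(df)
```

The rows outside the mask keep their values, and the last old-year value drops off the end, which matches the "What I want" table where 01/12/2018 is empty and 65 lands on the following day.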