Mapping data to the same weekday in another year in Python pandas

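I have a pandas DataFrame holding a full year of electricity usage data, but I want to update the table to another year. I want each data value to land on the same weekday as before.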

What I have:

Date          00:00   ...     WeekDay   requiredDate  requiredWeekDay
25/11/2018       20            Sunday     25/11/2018           Sunday
26/11/2018       30            Monday     26/11/2018           Monday
27/11/2018       25           Tuesday     27/11/2018          Tuesday
28/11/2018       35         Wednesday     28/11/2018        Wednesday
29/11/2018       40          Thursday     29/11/2018         Thursday
30/11/2018       15            Friday     30/11/2018           Friday
01/12/2017       65            Sunday     01/12/2018         Saturday
02/12/2017       34            Monday     02/12/2018           Sunday
03/12/2017       81           Tuesday     03/12/2018           Monday
04/12/2017       62         Wednesday     04/12/2018          Tuesday
...

What I want:

Date          00:00   ...     WeekDay     
25/11/2018       20            Sunday               
26/11/2018       30            Monday              
27/11/2018       25           Tuesday         
28/11/2018       35         Wednesday        
29/11/2018       40          Thursday           
30/11/2018       15            Friday               
01/12/2018                   Saturday            
02/12/2018       65            Sunday              
03/12/2018       34            Monday               
04/12/2018       81           Tuesday            
...

What I have tried:

df['Day'] = df['Date'].dt.day
df['Month'] = df['Date'].dt.month
df['Year'] = df['Date'].dt.year
requiredYear = str(df['Year'].median()).replace(".0","")

df = df.sort_values(by = ['Month', 'Day']).reset_index()

df['RemappedDate']= np.nan

for index, row in df.iterrows():
  if row['Weekday'] != row['requiredWeekday']:
    while row[row['Day']]<31:
      row['Day'] = row['Day']-1    
      row['RemappedDate'] = pd.to_datetime(str(row['Month'])+"/"+ 
                            str(row['Day'])+"/"+requiredYear)
  else:
    print("Already equal")

df['Date'] = df['RemappedDate']
df['Weekday'] = df['requiredWeekday']

I may not be far off; apologies if I am. I'm a beginner.

IIUC, you can advance the year without relying on the required-date columns:

print(df)

        Date  something    WeekDay
0 2018-11-25         20     Sunday
1 2018-11-26         30     Monday
2 2018-11-27         25    Tuesday
3 2018-11-28         35  Wednesday
4 2018-11-29         40   Thursday
5 2018-11-30         15     Friday
6 2017-12-01         65     Sunday
7 2017-12-02         34     Monday
8 2017-12-03         81    Tuesday
9 2017-12-04         62  Wednesday

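# Move the 2017 rows forward one year (plus 12 hours); the 2018 rows stay as they are.
shift = pd.to_timedelta(1, unit='y') + pd.to_timedelta(12, unit='h')
df['new_Date'] = df['Date'].mask(df['Date'].dt.year == 2017, df['Date'] + shift)
df['required_date'] = df.new_Date.dt.date
df['new_day'] = df.new_Date.dt.day_name()
# Where the weekday changed, look up the value recorded under that weekday name.
changed = df.WeekDay != df.new_day
lookup = dict(zip(df.loc[changed, 'WeekDay'], df.loc[changed, 'something']))
df['new_value'] = np.where(~changed, df.something, df.new_day.map(lookup))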

print(df)

        Date  something    WeekDay            new_Date required_date  \
0 2018-11-25         20     Sunday 2018-11-25 00:00:00    2018-11-25   
1 2018-11-26         30     Monday 2018-11-26 00:00:00    2018-11-26   
2 2018-11-27         25    Tuesday 2018-11-27 00:00:00    2018-11-27   
3 2018-11-28         35  Wednesday 2018-11-28 00:00:00    2018-11-28   
4 2018-11-29         40   Thursday 2018-11-29 00:00:00    2018-11-29   
5 2018-11-30         15     Friday 2018-11-30 00:00:00    2018-11-30   
6 2017-12-01         65     Sunday 2018-12-01 17:49:12    2018-12-01   
7 2017-12-02         34     Monday 2018-12-02 17:49:12    2018-12-02   
8 2017-12-03         81    Tuesday 2018-12-03 17:49:12    2018-12-03   
9 2017-12-04         62  Wednesday 2018-12-04 17:49:12    2018-12-04   

     new_day  new_value  
0     Sunday       20.0  
1     Monday       30.0  
2    Tuesday       25.0  
3  Wednesday       35.0  
4   Thursday       40.0  
5     Friday       15.0  
6   Saturday        NaN  
7     Sunday       65.0  
8     Monday       34.0  
9    Tuesday       81.0 
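
As far as I know, recent pandas releases reject `unit='y'` in `pd.to_timedelta` because a year is not a fixed duration. A minimal sketch of the same idea using `pd.DateOffset(years=1)` instead, assuming the ten sample rows above (the variable names `is_2017`, `changed` and `lookup` are only illustrative):

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the rows printed above.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2018-11-25", "2018-11-26", "2018-11-27", "2018-11-28", "2018-11-29",
        "2018-11-30", "2017-12-01", "2017-12-02", "2017-12-03", "2017-12-04",
    ]),
    "something": [20, 30, 25, 35, 40, 15, 65, 34, 81, 62],
    "WeekDay": ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Sunday", "Monday", "Tuesday", "Wednesday"],
})

# Calendar-aware shift: only the 2017 rows move to the same day in 2018.
is_2017 = df["Date"].dt.year == 2017
df["new_Date"] = df["Date"].mask(is_2017, df["Date"] + pd.DateOffset(years=1))
df["new_day"] = df["new_Date"].dt.day_name()

# For rows whose weekday changed, fetch the value stored under that weekday name.
changed = df["WeekDay"] != df["new_day"]
lookup = dict(zip(df.loc[changed, "WeekDay"], df.loc[changed, "something"]))
df["new_value"] = np.where(changed, df["new_day"].map(lookup), df["something"])

print(df[["new_Date", "new_day", "new_value"]])
```

Because the lookup is keyed on weekday names, this only works while each weekday appears at most once among the shifted rows; with a full year of data you would need a different key (for example week number plus weekday).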

If I were you, I would just keep the two columns already created for you and shift the `something` column, for example...

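# Select the 2017 rows and shift their values down one row so each lands on its old weekday.
df['something'] = df['something'].astype(float)  # shift() introduces NaN
mask = df['Date'] <= '2018-01-01'
df.loc[mask, 'something'] = df.loc[mask, 'something'].shift(1)  # .loc avoids chained assignment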

You can keep the two columns "new_Date" and "new_day". Drop the others and rename these two to whatever you like. :)
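
Put together, the shift-based suggestion might look like the sketch below. It assumes the same ten sample rows and a one-day weekday offset between the two years, which is what the sample shows; the one-row `shift(1)` is not a general rule.

```python
import pandas as pd

# Sample frame rebuilt from the question's rows (values kept as float so NaN fits).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2018-11-25", "2018-11-26", "2018-11-27", "2018-11-28", "2018-11-29",
        "2018-11-30", "2017-12-01", "2017-12-02", "2017-12-03", "2017-12-04",
    ]),
    "something": [20.0, 30.0, 25.0, 35.0, 40.0, 15.0, 65.0, 34.0, 81.0, 62.0],
})

# Rows that still belong to the old year.
mask = df["Date"] < "2018-01-01"

# Shift the old-year values down by one row; the first becomes NaN
# and the last one (62) falls off the end.
df.loc[mask, "something"] = df.loc[mask, "something"].shift(1)

# Move the old-year dates into the new year and recompute the weekday names.
df.loc[mask, "Date"] = df.loc[mask, "Date"] + pd.DateOffset(years=1)
df["WeekDay"] = df["Date"].dt.day_name()

print(df)
```

Note that the shift discards the last old-year value; with a full year of data you would need to decide what to do with the rows that fall off the end.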