Num day to Name day Pandas

Num day to Name day with Pandas

如果我使用这个函数 pd.DatetimeIndex(dfTrain['datetime']).weekday 我得到了星期几,但是我没有找到任何给出日期名称的函数...所以我需要将 0 转换为星期一,1到星期二等等。

这是我的数据框示例:

            datetime    season holiday workingday weather   temp    atemp   humidity    windspeed   count
    0   2011-01-01 00:00:00 1   0   0   1   9.84    14.395  81  0.0000  16
    1   2011-01-01 01:00:00 1   0   0   1   9.02    13.635  80  0.0000  40
    2   2011-01-01 02:00:00 1   0   0   1   9.02    13.635  80  0.0000  32
    3   2011-01-01 03:00:00 1   0   0   1   9.84    14.395  75  0.0000  13
    4   2011-01-01 04:00:00 1   0   0   1   9.84    14.395  75  0.0000  1
    5   2011-01-01 05:00:00 1   0   0   2   9.84    12.880  75  6.0032  1
    6   2011-01-01 06:00:00 1   0   0   1   9.02    13.635  80  0.0000  2
    7   2011-01-01 07:00:00 1   0   0   1   8.20    12.880  86  0.0000  3
    8   2011-01-01 08:00:00 1   0   0   1   9.84    14.395  75  0.0000  8
    9   2011-01-01 09:00:00 1   0   0   1   13.12   17.425  76  0.0000  14

还有一个问题,pandas.DatetimeIndex.dayofweekpandas.DatetimeIndex.weekday的区别是什么?

一种方法,只要 datetime 已经是 datetime 列,就是应用 datetime.strftime 来获取工作日的字符串:

In [105]:

df['weekday'] = df[['datetime']].apply(lambda x: dt.datetime.strftime(x['datetime'], '%A'), axis=1)
df
Out[105]:
             datetime  season  holiday  workingday  weather   temp   atemp  \
0 2011-01-01 00:00:00       1        0           0        1   9.84  14.395   
1 2011-01-01 01:00:00       1        0           0        1   9.02  13.635   
2 2011-01-01 02:00:00       1        0           0        1   9.02  13.635   
3 2011-01-01 03:00:00       1        0           0        1   9.84  14.395   
4 2011-01-01 04:00:00       1        0           0        1   9.84  14.395   
5 2011-01-01 05:00:00       1        0           0        2   9.84  12.880   
6 2011-01-01 06:00:00       1        0           0        1   9.02  13.635   
7 2011-01-01 07:00:00       1        0           0        1   8.20  12.880   
8 2011-01-01 08:00:00       1        0           0        1   9.84  14.395   
9 2011-01-01 09:00:00       1        0           0        1  13.12  17.425   

   humidity  windspeed  count   weekday  
0        81     0.0000     16  Saturday  
1        80     0.0000     40  Saturday  
2        80     0.0000     32  Saturday  
3        75     0.0000     13  Saturday  
4        75     0.0000      1  Saturday  
5        75     6.0032      1  Saturday  
6        80     0.0000      2  Saturday  
7        86     0.0000      3  Saturday  
8        75     0.0000      8  Saturday  
9        76     0.0000     14  Saturday  

关于你的另一个问题,dayofweekweekday没有区别。

将工作日映射定义为等效字符串并在工作日调用映射会更快:

dayOfWeek={0:'Monday', 1:'Tuesday', 2:'Wednesday', 3:'Thursday', 4:'Friday', 5:'Saturday', 6:'Sunday'}
df['weekday'] = df['datetime'].dt.dayofweek.map(dayOfWeek)

对于 0.15.0 之前的版本,以下应该有效:

import datetime as dt
df['weekday'] = df['datetime'].apply(lambda x: dt.datetime.strftime(x, '%A'))

版本 0.18.1 及更高版本

现在有一种新的便捷方法 dt.weekday_name 可以执行上述操作

版本 0.23.0 及更新版本

weekday_name 现在被描述为 dt.day_name

您可以使用的最新版本dt.day_name

df['weekday'] = df['datetime'].dt.day_name
print df
             datetime  season  holiday  workingday  weather   temp   atemp  \
0 2011-01-01 00:00:00       1        0           0        1   9.84  14.395   
1 2011-01-01 01:00:00       1        0           0        1   9.02  13.635   
2 2011-01-01 02:00:00       1        0           0        1   9.02  13.635   
3 2011-01-01 03:00:00       1        0           0        1   9.84  14.395   
4 2011-01-01 04:00:00       1        0           0        1   9.84  14.395   
5 2011-01-01 05:00:00       1        0           0        2   9.84  12.880   
6 2011-01-01 06:00:00       1        0           0        1   9.02  13.635   
7 2011-01-01 07:00:00       1        0           0        1   8.20  12.880   
8 2011-01-01 08:00:00       1        0           0        1   9.84  14.395   
9 2011-01-01 09:00:00       1        0           0        1  13.12  17.425   

   humidity  windspeed  count   weekday  
0        81     0.0000     16  Saturday  
1        80     0.0000     40  Saturday  
2        80     0.0000     32  Saturday  
3        75     0.0000     13  Saturday  
4        75     0.0000      1  Saturday  
5        75     6.0032      1  Saturday  
6        80     0.0000      2  Saturday  
7        86     0.0000      3  Saturday  
8        75     0.0000      8  Saturday  
9        76     0.0000     14  Saturday  

使用 dt.weekday_namedeprecated since pandas 0.23.0, instead, use dt.day_name():

df.datetime.dt.day_name()

0    Saturday
1    Saturday
2    Saturday
3    Saturday
4    Saturday
5    Saturday
6    Saturday
7    Saturday
8    Saturday
9    Saturday
Name: datetime, dtype: object

添加到@jezrael 之前的正确回答,你可以使用这个:

import calendar
df['weekday'] = pd.Series(pd.Categorical(df['datetime'].dt.weekday_name, categories=list(calendar.day_name)))

它还根据 [=11] 为您的新分类变量提供 order(在此示例中:'Monday',...,'Sunday') =].这可能对您的后续分析步骤有所帮助。