Pandas DataFrame: assign values to day of the week (feature extraction)
I am extracting features to identify weekdays. This is what I have so far:
days = {0:'Mon', 1: 'Tues', 2:'Wed', 3:'Thurs', 4:'Fri', 5:'Sat', 6:'Sun'}
data['day_of_week'] = data['day_of_week'].apply(lambda x: days[x])
# Compare against 'Fri', the label used in `days` ('Friday' would never match)
data['if_Weekday'] = np.where(
    (data['day_of_week'] == 'Mon') | (data['day_of_week'] == 'Tues') |
    (data['day_of_week'] == 'Wed') | (data['day_of_week'] == 'Thurs') |
    (data['day_of_week'] == 'Fri'), '1', '0')
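This code assigns 1 to Monday through Friday and 0 to Saturday and Sunday. However, I want to assign a different value to each weekday. For example, Mon = 1, Tues = 2, Wed = 3, Thurs = 4, Fri = 5, and Sat and Sun should both equal 0.
Any help would be appreciated.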
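OK, I think the simplest thing here is to use np.where to test whether the weekday is greater than or equal to 5. If it is, assign 0; otherwise return the weekday value plus 1: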
In [21]:
df['is_weekday'] = np.where(df['weekday'] >= 5, 0, df['weekday'] + 1)
df
Out[21]:
       dates  weekday  is_weekday
0 2016-01-01        4           5
1 2016-01-02        5           0
2 2016-01-03        6           0
3 2016-01-04        0           1
4 2016-01-05        1           2
5 2016-01-06        2           3
6 2016-01-07        3           4
7 2016-01-08        4           5
8 2016-01-09        5           0
9 2016-01-10        6           0
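The answer does not show how `df` was built, so here is a minimal, self-contained sketch of the same idea. It assumes the `weekday` column comes from `Series.dt.dayofweek` (Monday = 0, Sunday = 6) applied to ten consecutive dates starting 2016-01-01; that setup is an assumption, not something stated above.

```python
import numpy as np
import pandas as pd

# Assumed setup: ten consecutive days starting 2016-01-01 (a Friday)
df = pd.DataFrame({'dates': pd.date_range('2016-01-01', periods=10)})

# Monday = 0 ... Sunday = 6
df['weekday'] = df['dates'].dt.dayofweek

# Weekend days (5 and 6) become 0; weekdays are shifted to 1..5
df['is_weekday'] = np.where(df['weekday'] >= 5, 0, df['weekday'] + 1)

print(df)
```

Because `np.where` works on whole columns at once, no per-row `apply` is needed.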
I think you can use mask:
import pandas as pd

data = pd.DataFrame({'day_of_week': [0, 1, 2, 3, 4, 5, 6]})
# copy the original column to a new one, adding 1
data['if_Weekday'] = data['day_of_week'] + 1
# map day numbers to names if necessary
days = {0: 'Mon', 1: 'Tues', 2: 'Wed', 3: 'Thurs', 4: 'Fri', 5: 'Sat', 6: 'Sun'}
data['day_of_week'] = data['day_of_week'].map(days)
# set weekend days (values 6 and 7 after the shift) to 0
data['if_Weekday'] = data['if_Weekday'].mask(data['if_Weekday'] >= 6, 0)
print(data)
  day_of_week  if_Weekday
0         Mon           1
1        Tues           2
2         Wed           3
3       Thurs           4
4         Fri           5
5         Sat           0
6         Sun           0
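If the column already holds day names, as it does in the question once the `days` mapping has been applied, another option is to map the names straight to the target values. This is only a sketch: it assumes the abbreviations from the question's `days` dict, and the `weekday_values` name is made up for illustration.

```python
import pandas as pd

data = pd.DataFrame(
    {'day_of_week': ['Mon', 'Tues', 'Wed', 'Thurs', 'Fri', 'Sat', 'Sun']}
)

# Hypothetical lookup: weekdays get 1..5; weekend names are left out on purpose
weekday_values = {'Mon': 1, 'Tues': 2, 'Wed': 3, 'Thurs': 4, 'Fri': 5}

# Names missing from the dict map to NaN, which is then filled with 0
data['if_Weekday'] = (
    data['day_of_week'].map(weekday_values).fillna(0).astype(int)
)

print(data)
```

Note that any misspelled or unexpected label would also end up as 0 here, so it may be worth checking the column's unique values first.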