How do I group by Week and Year, and depending on that, assign values to a third column having 'N' as all values initially?
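I have a dataframe with 4 columns. I created a new column and assigned 'N' to all of its values.
Suppose the first two columns hold some arbitrary information, Column3 = Year and Column4 = Week No. Column5 = Week-ES (initially all 'N') should be equal to 'Week No.' for the most recent 5 weeks, and all weeks before that should be equal to 'Pastwk'. How can I use group by together with a "top 5 weeks" clause? I used the code below but did not get the desired result. The desired result is the table that follows the code: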
df.sort_values(['Year','Week No.'],ascending=[False,False],inplace = True)
df['Week-ES'] = 'N'
df = df.groupby(['Year','Week No.']).size()
df['Week-ES'][:5]= df['Week No.'][:5]
#for i in range(5):
# df.loc[df['Week-ES'].index == i, 'Week-ES'] = df['Week No.'].iloc[i]
df.iloc[5:]['Week-ES'] = 'Past WK'
Col1 | Col2 | Year | WeekNo. | Week-ES |
---|---|---|---|---|
v1 | v2 | 2020 | 48 | Recent |
v2 | v3 | 2020 | 47 | Recent |
v3 | v4 | 2020 | 47 | Recent |
v4 | v5 | 2020 | 46 | Recent |
v5 | v6 | 2020 | 40 | Pastwk |
v6 | v7 | 2019 | 52 | PastWk |
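The idea is to convert the values to weekly periods: take the current weekly period, subtract 5 weeks, and compare it against the weekly periods built from datetimes derived from the Year and Week columns. Use Series.ge for "greater or equal" and pass the result to numpy.where: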
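import numpy as np
import pandas as pd

# Weekly period that lies 5 weeks before the current week
last = pd.to_datetime('now').to_period('W') - 5
print (last)
2020-11-09/2020-11-15
# Build strings such as '202048-1' (year, week number, weekday 1 = Monday)
s = df['Year'].astype(str).add(df['Week'].astype(str).add('-1'))
# Parse them as dates and convert to weekly periods
dates = pd.to_datetime(s, format='%Y%W-%w').dt.to_period('W')
# Label rows whose week is at or after the cutoff as 'Recent'
df['C'] = np.where(dates.ge(last), 'Recent', 'Pastwk')
print (df)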
Year Week C
0 2020 48 Recent
1 2020 47 Recent
2 2020 47 Recent
3 2020 46 Recent
4 2020 40 Pastwk
5 2019 52 Pastwk
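Because the snippet above depends on the day it is run, the printed result is hard to reproduce. The following is a minimal sketch of the same approach with the reference date passed in explicitly. The function name `label_recent_weeks` and its `today` and `n_weeks` parameters are hypothetical names introduced here for illustration; the reference date 2020-12-16 is an assumption chosen so that the cutoff matches the `2020-11-09/2020-11-15` period printed above.

```python
import numpy as np
import pandas as pd


def label_recent_weeks(df, today, n_weeks=5):
    """Label each row 'Recent' or 'Pastwk' relative to a given reference date.

    `today` is passed in (instead of reading the clock) so the result is
    reproducible. Hypothetical helper, not part of the original answer.
    """
    # Weekly period n_weeks before the week that contains `today`
    last = pd.Timestamp(today).to_period('W') - n_weeks
    # Strings such as '202048-1': year, %W week number, weekday 1 (Monday)
    s = df['Year'].astype(str) + df['Week'].astype(str) + '-1'
    dates = pd.to_datetime(s, format='%Y%W-%w').dt.to_period('W')
    return np.where(dates >= last, 'Recent', 'Pastwk')


df = pd.DataFrame({'Year': [2020, 2020, 2020, 2020, 2020, 2019],
                   'Week': [48, 47, 47, 46, 40, 52]})
# Assumed reference date; it falls in the week 2020-12-14/2020-12-20
df['C'] = label_recent_weeks(df, today='2020-12-16')
print(df)
```

Note that `%W` counts weeks starting from the first Monday of the year, which is not the same as ISO week numbering, so the Week column is assumed to follow that convention here.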
import datetime
import numpy as np
# Get the current ISO week number and the current year
current_week = datetime.date.today().isocalendar()[1]
current_year = datetime.datetime.now().year
df['C'] = np.where(((df['Week'] >= current_week - 5) & (df['Year'] == current_year)), 'Recent', 'Pastwk')
You will then have to handle the turn of the year: in January 2021, current_week can be 1, but the recent weeks will not be -5, -4, ...; they will be 53, 52, etc.
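One possible way to deal with that year boundary is to compare dates instead of raw week numbers. The sketch below is not from the original answer. It assumes that Year and Week hold ISO year and ISO week numbers, and it uses `datetime.date.fromisocalendar` (Python 3.8+) to turn each pair into the Monday of that week. The helper name `label_recent_iso` and the sample frame are hypothetical.

```python
import datetime

import numpy as np
import pandas as pd


def label_recent_iso(df, today, n_weeks=5):
    """Label rows 'Recent'/'Pastwk' by comparing week-start dates.

    Assumes df['Year'] / df['Week'] are ISO year and ISO week numbers.
    Hypothetical helper for illustration only.
    """
    iso = today.isocalendar()
    # Monday of the ISO week that contains `today`
    this_monday = datetime.date.fromisocalendar(iso[0], iso[1], 1)
    # Monday of the week n_weeks earlier; this works across year boundaries
    cutoff = this_monday - datetime.timedelta(weeks=n_weeks)
    starts = [datetime.date.fromisocalendar(int(y), int(w), 1)
              for y, w in zip(df['Year'], df['Week'])]
    return np.where([d >= cutoff for d in starts], 'Recent', 'Pastwk')


# Sample data around the 2020/2021 boundary (ISO year 2020 has 53 weeks)
df_iso = pd.DataFrame({'Year': [2021, 2020, 2020, 2020],
                       'Week': [1, 53, 52, 40]})
df_iso['C'] = label_recent_iso(df_iso, today=datetime.date(2021, 1, 6))
print(df_iso)
```

With the assumed reference date of 2021-01-06, weeks 53 and 52 of 2020 are still labelled 'Recent', while week 40 of 2020 is 'Pastwk'.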