How to save into .csv files based on weeks

I have a dataset from a .csv file with the headers created_at, text and label, as follows:

created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-01,OOO Wins the worldcup,Sport
2021-08-02,JJJ Wins the worldcup,Sport
2021-08-03,YYY Wins the worldcup,Sport
2021-08-04,KKK Wins the worldcup,Sport
2021-08-05,YYY Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-08,SSS Wins the worldcup,Sport
2021-08-09,XYZ Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport

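How can I save these rows into .csv files by week? For example, I want to save only the first 7 days of the dataset above (2021-07-24 to 2021-07-30) into week1.csv, the next week (2021-07-31 to 2021-08-05) into week2.csv, and so on.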

week1.csv

created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport

IIUC, you can compute a weekly period and use groupby:

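import pandas as pd

# df is the DataFrame loaded from the question's .csv file
# weekly periods ending on Friday, so each week runs Saturday to Friday
group = pd.to_datetime(df['created_at']).dt.to_period('W-FRI')

for i, (g, d) in enumerate(df.groupby(group), start=1):
    print(f'saving week {i}: {g}')
    d.to_csv(f'week{i}.csv', index=False)  # index=False keeps the row index out of the file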

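NB. This uses weeks ending on Friday as the period, which matches the sample data because its first date (2021-07-24) is a Saturday.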
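To see what the 'W-FRI' anchor does, here is a minimal sketch that maps a few dates from the question to their weekly periods. The variable names `dates` and `periods` are illustrative only and do not come from the answer.

```python
import pandas as pd

# A few boundary dates taken from the question's dataset.
dates = pd.to_datetime(pd.Series([
    "2021-07-24",  # Saturday, first day of week 1
    "2021-07-30",  # Friday, last day of week 1
    "2021-07-31",  # Saturday, first day of week 2
    "2021-08-06",  # Friday, last day of week 2
    "2021-08-07",  # Saturday, first day of week 3
]))

# 'W-FRI' means weekly periods that end on a Friday.
periods = dates.dt.to_period("W-FRI")

for d, p in zip(dates, periods):
    print(d.date(), d.day_name(), "->", p)
```

Dates that fall in the same Saturday-to-Friday span get the same period, which is why groupby puts them in the same file.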

To compute the anchor programmatically, so that the weeks start on the first day of the data, use:

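s = pd.to_datetime(df['created_at'])
# weekday just before the first date, e.g. 'FRI' when the (sorted) data starts on a Saturday
dow = (s.iloc[0]-pd.Timedelta('1d')).strftime("%a").upper()
group = s.dt.to_period(f'W-{dow}')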
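As an alternative that avoids weekday names altogether, the rows can be binned into 7-day blocks counted from the earliest date. This is a different technique from the period-based grouping above, sketched here on a subset of the question's rows; `csv_text`, `week_no` and `chunks` are hypothetical names, and the in-memory CSV stands in for the real file.

```python
import io
import pandas as pd

# Subset of the question's data, read from memory instead of a file.
csv_text = """created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport
"""
df = pd.read_csv(io.StringIO(csv_text))

s = pd.to_datetime(df["created_at"])
# Days elapsed since the earliest date, integer-divided into 7-day blocks (1-based).
week_no = (s - s.min()).dt.days // 7 + 1

chunks = {int(n): d for n, d in df.groupby(week_no)}
for n, d in chunks.items():
    first, last = d["created_at"].iloc[0], d["created_at"].iloc[-1]
    print(f"week{n}.csv: {len(d)} rows, {first} to {last}")
    # d.to_csv(f"week{n}.csv", index=False)  # uncomment to actually write the file
```

Unlike the enumerate-based numbering, a week with no rows leaves a gap in the numbers here (for example week1 followed by week3), which may or may not be what you want.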

Output:

saving week 1: 2021-07-24/2021-07-30
saving week 2: 2021-07-31/2021-08-06
saving week 3: 2021-08-07/2021-08-13

Files:

week1.csv
   created_at                         text  label
0  2021-07-24  Newzeland Wins the worldcup  Sport
1  2021-07-25        ABC Wins the worldcup  Sport
2  2021-07-26           Hello the worldcup  Sport
3  2021-07-27             Cricket worldcup  Sport
4  2021-07-28               Rugby worldcup  Sport
5  2021-07-29                     LLL Wins  Sport
6  2021-07-30        MMM Wins the worldcup  Sport

week2.csv
    created_at                   text  label
7   2021-07-31  RRR Wins the worldcup  Sport
8   2021-08-01  OOO Wins the worldcup  Sport
9   2021-08-02  JJJ Wins the worldcup  Sport
10  2021-08-03  YYY Wins the worldcup  Sport
11  2021-08-04  KKK Wins the worldcup  Sport
12  2021-08-05  YYY Wins the worldcup  Sport
13  2021-08-06  GGG Wins the worldcup  Sport

week3.csv
    created_at                   text  label
14  2021-08-07  FFF Wins the worldcup  Sport
15  2021-08-08  SSS Wins the worldcup  Sport
16  2021-08-09  XYZ Wins the worldcup  Sport
17  2021-08-10  PQR Wins the worldcup  Sport
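The listings above are displayed with the DataFrame row index (0, 1, ..., 14, ...). Whether that index ends up in the written file depends on the `index` argument of `to_csv`, which defaults to True. A small sketch of the difference, using two rows from week3 and writing to strings rather than files:

```python
import pandas as pd

# Two rows from week3, keeping their original row labels 14 and 15.
d = pd.DataFrame(
    {
        "created_at": ["2021-08-07", "2021-08-08"],
        "text": ["FFF Wins the worldcup", "SSS Wins the worldcup"],
        "label": ["Sport", "Sport"],
    },
    index=[14, 15],
)

# With no path argument, to_csv returns the CSV content as a string.
with_index = d.to_csv()                # default index=True adds an unnamed first column
without_index = d.to_csv(index=False)  # matches the header layout asked for in the question

print(with_index)
print(without_index)
```

With `index=False`, the header line is exactly created_at,text,label, as in the week1.csv layout the question asks for.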