How to save into .csv files based on weeks
I have a dataset from a .csv file with the headers created_at, text, and label, as shown below:
created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-01,OOO Wins the worldcup,Sport
2021-08-02,JJJ Wins the worldcup,Sport
2021-08-03,YYY Wins the worldcup,Sport
2021-08-04,KKK Wins the worldcup,Sport
2021-08-05,YYY Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-08,SSS Wins the worldcup,Sport
2021-08-09,XYZ Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport
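How can I save these rows into .csv files based on weeks?
For example, I want to save the first 7 days of the dataset above (2021-07-24 to 2021-07-30) into week1.csv, the next 7 days (2021-07-31 to 2021-08-06) into week2.csv, and so on.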
week1.csv
created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
IIUC, you can compute a weekly period and use groupby:
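import pandas as pd

# df is the DataFrame loaded from the .csv file; 'W-FRI' means weekly periods ending on Friday
group = pd.to_datetime(df['created_at']).dt.to_period('W-FRI')
for i, (g, d) in enumerate(df.groupby(group), start=1):
    print(f'saving week {i}: {g}')
    d.to_csv(f'week{i}.csv')  # add index=False to leave out the row index column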
NB. This uses weeks ending on Friday as the period.
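To see why a Friday-ending week matches the split asked for, here is a small check on three of the dates from the question. It is only a sketch; the names dates, periods and labels are made up for illustration.

```python
import pandas as pd

# Three dates from the question: a Saturday, the following Friday, and the next Saturday
dates = pd.to_datetime(pd.Series(['2021-07-24', '2021-07-30', '2021-07-31']))

# 'W-FRI' assigns each date to the weekly period that ends on a Friday
periods = dates.dt.to_period('W-FRI')

# The first two dates share one period; the third opens the next one
labels = periods.astype(str).tolist()
print(labels)
```

Because 2021-07-24 is a Saturday, the Friday-ending period starts exactly on the first row of the data, which is what makes week 1 a full seven days here.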
To compute the week-ending day programmatically from the first date, use:
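s = pd.to_datetime(df['created_at'])
# weekday abbreviation of the day before the first date (e.g. 'FRI'), i.e. the day each week ends on
dow = (s.iloc[0] - pd.Timedelta('1d')).strftime('%a').upper()
group = s.dt.to_period(f'W-{dow}')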
Output:
saving week 1: 2021-07-24/2021-07-30
saving week 2: 2021-07-31/2021-08-06
saving week 3: 2021-08-07/2021-08-13
Files:
week1.csv
created_at text label
0 2021-07-24 Newzeland Wins the worldcup Sport
1 2021-07-25 ABC Wins the worldcup Sport
2 2021-07-26 Hello the worldcup Sport
3 2021-07-27 Cricket worldcup Sport
4 2021-07-28 Rugby worldcup Sport
5 2021-07-29 LLL Wins Sport
6 2021-07-30 MMM Wins the worldcup Sport
week2.csv
created_at text label
7 2021-07-31 RRR Wins the worldcup Sport
8 2021-08-01 OOO Wins the worldcup Sport
9 2021-08-02 JJJ Wins the worldcup Sport
10 2021-08-03 YYY Wins the worldcup Sport
11 2021-08-04 KKK Wins the worldcup Sport
12 2021-08-05 YYY Wins the worldcup Sport
13 2021-08-06 GGG Wins the worldcup Sport
week3.csv
created_at text label
14 2021-08-07 FFF Wins the worldcup Sport
15 2021-08-08 SSS Wins the worldcup Sport
16 2021-08-09 XYZ Wins the worldcup Sport
17 2021-08-10 PQR Wins the worldcup Sport
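As an alternative to the period-based grouping above (not part of the original answer), the rows can also be bucketed into fixed 7-day blocks counted from the earliest date. The sketch below assumes a hypothetical helper save_weekly and a temporary output directory, and uses a shortened sample of the data.

```python
from pathlib import Path
import tempfile

import pandas as pd


def save_weekly(df, out_dir):
    """Write week1.csv, week2.csv, ... into out_dir and return the paths written."""
    s = pd.to_datetime(df['created_at'])
    # Whole days elapsed since the earliest date, integer-divided into 7-day blocks
    week_no = (s - s.min()).dt.days // 7 + 1
    paths = []
    for n, chunk in df.groupby(week_no):
        path = Path(out_dir) / f'week{n}.csv'
        chunk.to_csv(path, index=False)  # index=False keeps only the original columns
        paths.append(path)
    return paths


# Shortened sample of the question's data (hypothetical text values)
df = pd.DataFrame({
    'created_at': ['2021-07-24', '2021-07-30', '2021-07-31', '2021-08-06', '2021-08-07'],
    'text': ['a', 'b', 'c', 'd', 'e'],
    'label': ['Sport'] * 5,
})

paths = save_weekly(df, tempfile.mkdtemp())
print([p.name for p in paths])
```

One difference from the enumerate-based loop: here the file number reflects the calendar distance from the first date, so a week with no rows leaves a gap in the numbering instead of being skipped.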