Python beginner: extract each row from a CSV file and write it to a separate CSV file

I have a .csv file with 40 rows of weather station data, similar to this:

Date        Station                  PET  Max Temp  Min Temp

2/11/2016   Conroe                   0.09   70       33
2/11/2016   Huntsville               0.11   69       33
2/11/2016   Overton                  0.14   67       34
2/11/2016   Allen                    0.11   71       32
2/11/2016   Dallas AgriLife Center   0.17   71       37
2/11/2016   Forney                   0.13   70       35

I am trying to use pandas to extract each station's data from this file and write it to a separate .csv file per station.

I tried this code:

import pandas as pd

df = pd.read_csv('C:\Desktop\report.csv')

for Station in df:
    df[Station].to_csv('C:\data\'+ Station +'.csv')

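But this code extracts the data column by column instead, like this: (image of files created)

Please help me solve this. Is there a way to iterate row by row and extract the data, rather than writing out every data element? For example, loop over each row and create one CSV file per station.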

df =pd.DataFrame({'Date': {0: '2/11/2016', 1: '2/11/2016', 2: '2/11/2016', 3: '2/11/2016', 4: '2/11/2016', 5: '2/11/2016'}, 'PET': {0: 0.089999999999999997, 1: 0.11, 2: 0.14000000000000001, 3: 0.11, 4: 0.17000000000000001, 5: 0.13}, 'Max Temp': {0: 70, 1: 69, 2: 67, 3: 71, 4: 71, 5: 70}, 'Station': {0: 'Conroe', 1: 'Huntsville', 2: 'Overton', 3: 'Allen', 4: 'Dallas Agri Life Center', 5: 'Forney'}, 'Min Temp': {0: 33, 1: 33, 2: 34, 3: 32, 4: 37, 5: 35}})

# Write each station's rows to a file named after the station.
for station, group in df.groupby('Station'):
    group.to_csv(station + '.csv')
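
As a slightly fuller sketch of the same groupby idea, the function below writes one file per station into an output directory and returns the paths it wrote. The function name `split_by_station`, the `out_dir` and `column` parameters, and the choice of `index=False` (which leaves the row index out of the files) are assumptions for illustration, not part of the answer above.

```python
import os

import pandas as pd


def split_by_station(df, out_dir, column="Station"):
    """Write one CSV per distinct value of `column`; return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    # groupby yields (key, sub-frame) pairs, one per distinct station.
    for station, group in df.groupby(column):
        path = os.path.join(out_dir, f"{station}.csv")
        group.to_csv(path, index=False)  # index=False: do not write the row index
        paths.append(path)
    return paths
```

Because groupby collects all rows that share a station name, this also works if the report later contains several dates per station.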

df[Station] only selects a column. What you want is the following, in pseudocode:

for each station in stations:
    select that station's rows and put them in a separate data frame

when done, write each data frame to a file.

This is not hard to do in pandas either. Here is how:

In [53]: for name in df.Station:
   ....:     print(df[df.Station == name])
   ....:     
        Date Station   PET  Max Temp  Min Temp
0  2/11/2016  Conroe  0.09        70        33
        Date     Station   PET  Max Temp  Min Temp
1  2/11/2016  Huntsville  0.11        69        33
        Date  Station   PET  Max Temp  Min Temp
2  2/11/2016  Overton  0.14        67        34
        Date Station   PET  Max Temp  Min Temp
3  2/11/2016   Allen  0.11        71        32
        Date                 Station   PET  Max Temp  Min Temp
4  2/11/2016  Dallas AgriLife Center  0.17        71        37
        Date Station   PET  Max Temp  Min Temp
5  2/11/2016  Forney  0.13        70        35

This only prints, but you can replace the print with a write to a new CSV:

In [54]: for name in df.Station:
   ....:     df[df.Station == name].to_csv(name+'.csv')
   ....:     

In [55]: ls
Allen.csv  Conroe.csv  Dallas AgriLife Center.csv  foo.csv  Forney.csv  Huntsville.csv  Overton.csv  stations.csv

Each file now contains the data you want.
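
One caveat with looping over df.Station directly: it visits every row, so a station that appears on several rows has its file rewritten once per row. A minimal sketch that visits each station once, using the same boolean-mask selection, might look like this. The function name `write_station_files`, the `out_dir` parameter, and the use of `index=False` are assumptions for illustration.

```python
import os

import pandas as pd


def write_station_files(df, out_dir):
    """Write one CSV per station; return {station name: number of rows written}."""
    os.makedirs(out_dir, exist_ok=True)
    written = {}
    # unique() yields each station name once, even if it appears on many rows.
    for name in df["Station"].unique():
        rows = df[df["Station"] == name]  # boolean mask keeps matching rows only
        rows.to_csv(os.path.join(out_dir, name + ".csv"), index=False)
        written[name] = len(rows)
    return written
```

On Windows, the output directory can be passed as a raw string such as r'C:\data'. Note that in the question's 'C:\data\' the trailing backslash escapes the closing quote, so building the path with os.path.join avoids that problem.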