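Search multiple CSVs for one particular value
I have a directory of CSV files that list fruits and quantities. I want to search all of these files and create a new CSV that contains only one particular fruit (for example, "apple").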
list1.csv
| Name | Qty |
| -------- | --- |
| apple |15 |
| apple |50 |
| mango |20 |
| grapes |49 |
list2.csv
| Name | Qty |
| -------- | --- |
| apple |25 |
| apple |50 |
| Banana |34 |
| mango |20 |
| grapes |49 |
list3.csv
| Name | Qty |
| -------- | --- |
| apple |125 |
| apple |530 |
| mango |20 |
| grapes |49 |
For "apple", I want:
new.csv
| Name | Qty |
| -------- | --- |
| apple |15 |
| apple |50 |
| apple |25 |
| apple |50 |
| apple |125 |
| apple |530 |
import pandas as pd
import glob, os
path = ("E:/Data/Fdata")
all_files = glob.glob(path + "/*.csv")
li=[]
for filename in all_files:
    df=pd.read_csv(filename, index_col=None, header=0)
    ndf = df[df["Name"].str.contains("Apple")]
    li.append(ndf)
ndf.to_csv("E:/Data/Fdata/onlyapple.csv", index=True)
- Read all the CSV files into a master DataFrame
- Filter on the "Name" you want and write the result with to_csv
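import os
import pandas as pd

# Read every CSV in the current directory and stack them into one DataFrame.
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
files = [f for f in os.listdir(".") if f.endswith(".csv")]
master = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

# Keep only the rows whose Name is exactly "apple" and write them out.
master[master["Name"].eq("apple")].reset_index(drop=True).to_csv("onlyapple.csv")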
onlyapple.csv:
,Name,quantity,price
0,apple,15,500
1,apple,50,400
2,apple,15,500
3,apple,50,400
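One likely reason the attempt in the question produces nothing useful is that `str.contains("Apple")` is case-sensitive while the sample data uses lowercase `apple`, and only the last file's `ndf` is written. The following is a minimal sketch of a variant that filters each file as it is read and concatenates the matches. The function name `collect_rows` and its parameters are hypothetical, and it assumes every file has a `Name` column.

```python
from pathlib import Path

import pandas as pd


def collect_rows(directory, value, column="Name"):
    """Return the rows from every *.csv in `directory` whose `column` equals `value`."""
    matches = []
    for csv_path in sorted(Path(directory).glob("*.csv")):
        df = pd.read_csv(csv_path)
        # Compare case-insensitively and ignore stray spaces,
        # so "Apple " in a file still matches "apple".
        names = df[column].astype(str).str.strip().str.lower()
        matches.append(df[names == value.lower()])
    if not matches:
        # No CSV files found: return an empty frame instead of letting pd.concat raise.
        return pd.DataFrame()
    return pd.concat(matches, ignore_index=True)
```

It could be used as `collect_rows("E:/Data/Fdata", "apple").to_csv("onlyapple.csv", index=False)`. Passing `index=False` keeps the row index out of the output, and writing the result outside the input directory avoids reading it back in on the next run.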
Here is a way to do it without Pandas:
import csv
import glob

fruit = 'apple'
final = []
header = []

for file in glob.glob('./*.csv'):
    with open(file, newline='') as f:
        reader = csv.reader(f)
        header = next(reader)  # header should be same for each file
        for row in reader:
            if row[0] == fruit:
                final.append(row)

with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)  # use the last file's header
    writer.writerows(final)
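The script above matches on `row[0]`, so it depends on `Name` being the first column, and because it writes `output.csv` into the directory it globs, a second run would read its own output. A possible variation, sketched here with `csv.DictReader`, looks the column up by name and skips the output file. The function name `filter_csvs` and its parameters are hypothetical, and the header is assumed to be the same in every file.

```python
import csv
from pathlib import Path


def filter_csvs(directory, out_path, value, column="Name"):
    """Copy rows whose `column` equals `value` from every *.csv in `directory` to `out_path`."""
    out_path = Path(out_path)
    header = None
    rows = []
    for csv_path in sorted(Path(directory).glob("*.csv")):
        if csv_path.resolve() == out_path.resolve():
            continue  # do not read back the output of a previous run
        with open(csv_path, newline="") as f:
            reader = csv.DictReader(f)
            if header is None:
                header = reader.fieldnames  # take the header from the first file
            rows.extend(row for row in reader if row.get(column) == value)
    with open(out_path, "w", newline="") as f:
        # extrasaction="ignore" drops fields that are not in the chosen header.
        writer = csv.DictWriter(f, fieldnames=header or [column], extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For example, `filter_csvs(".", "output.csv", "apple")` would write the matching rows and return how many were found.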
Here is how to do it with two GoCSV commands: stack (which stacks your files on top of one another) and filter (which keeps only the rows you want):
gocsv stack *.csv | gocsv filter -c Name -eq apple