Not all data in a column is being copied to another csv file
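So I have two csv files. One has this format:
last name, first name, Number
The other has this format:
number, quiz
I want to create a new output file that takes these two csv files and gives me a file in this format:
last name, first name, number, quiz.
I have tried the following code and it works, but only for the first person listed in the two input files. I'm not sure what I'm doing wrong. Also, I don't want to assume that the two input files follow the same order.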
import sys, re
import numpy as np
import smtplib
from random import randint
import csv
import math

col = sys.argv[1]
source = sys.argv[2]
target = sys.argv[3]
newtarg = sys.argv[4]

input_source = csv.DictReader(open(source))
input_target = csv.DictReader(open(target))

data = {}
t = ()

for row in input_target:
    t = row['First Name'], row['number']
    for rows in input_source:
        if rows['number'] == row['number']:
            t = t + (rows[col],)
    name = row['Last Name']
    data[name] = [t]
    rows.next()
    row.next()

with open(newtarg,'w') as out:
    csv_out=csv.writer(out)
    for key, val in data.items():
        csv_out.writerow([key] + list(val))
This might be a job for pandas, the Python data analysis library:
import pandas as pd
x1 = pd.read_csv('x1.csv')
x2 = pd.read_csv('x2.csv')
result = pd.merge(x1, x2, on='number')
result.to_csv('result.csv',
              index=False,
              columns=['Last Name', 'First Name', 'number', 'quiz'])
Reference: https://chrisalbon.com/python/pandas_join_merge_dataframe.html
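One caveat: the question describes the key column as "Number" in one file and "number" in the other. If the headers really do differ in case, `on='number'` would not find the column in both frames. A minimal sketch of one way to handle that, assuming the header spellings below and using made-up in-memory sample data in place of the real files:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the two input files;
# the header spellings are assumptions based on the question text.
names_csv = io.StringIO(
    "Last Name,First Name,Number\n"
    "Smith,Ann,2\n"
    "Jones,Bob,1\n"
    "Lee,Cy,3\n"
)
quiz_csv = io.StringIO(
    "number,quiz\n"
    "1,90\n"
    "2,85\n"
)

names = pd.read_csv(names_csv)
quizzes = pd.read_csv(quiz_csv)

# left_on/right_on name the key column in each frame separately.
# how="left" keeps every person, even one with no quiz row (quiz becomes NaN).
merged = pd.merge(names, quizzes, how="left",
                  left_on="Number", right_on="number")

# Both key columns survive the merge; drop the redundant one.
merged = merged.drop(columns="number")

print(merged.to_csv(index=False))
```

The merge matches on key values, so the two files do not need to list people in the same order.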
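I think the following should work. Note: I've removed everything from the code in your question that wasn't being used (as you should have done before posting). I've also hardcoded the input values for testing.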
import csv

source = 'source1.csv'
target = 'target1.csv'
newtarg = 'new_output.csv'

targets = {}
with open(target) as file:
    for row in csv.DictReader(file):
        targets[row['number']] = row['quiz']

with open(source) as src, open(newtarg, 'w') as out:
    reader = csv.DictReader(src)
    writer = csv.writer(out)
    writer.writerow(reader.fieldnames + ['quiz'])  # create a header row (optional)
    for row in reader:
        row.update({'quiz': targets.get(row['Number'], 'no match')})
        writer.writerow(row.values())
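As for why the original nested loops only handled the first person, a likely cause is that a `csv.DictReader` can only be iterated once. The inner loop consumes the whole reader on the first pass of the outer loop, so every later pass finds nothing to compare against. A small sketch that shows this behaviour, using made-up in-memory data rather than the real files:

```python
import csv
import io

# Hypothetical stand-in for the quiz file.
quiz_file = io.StringIO("number,quiz\n1,90\n2,85\n")
reader = csv.DictReader(quiz_file)

# The first pass reads every row.
first_pass = [row["number"] for row in reader]

# The second pass gets nothing: the reader is already exhausted.
second_pass = [row["number"] for row in reader]

print(first_pass)
print(second_pass)
```

Reading one file into a dictionary up front, as in the code above, avoids the problem because a dictionary can be looked up any number of times.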