Not all data in a column is being copied to another csv file

I have two csv files. One has the following format:

last name, first name, Number

The other has this format:

number, quiz

I want to create a new output file that takes these two csv files and gives me a file in the following format:

last name, first name, number, quiz

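I've tried the following code and it works, but only for the first person listed in the two input files. I'm not sure what I'm doing wrong. Also, I don't want to assume that the two input files follow the same order.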

import sys, re
import numpy as np
import smtplib
from random import randint
import csv
import math

col = sys.argv[1]
source = sys.argv[2]
target = sys.argv[3]
newtarg = sys.argv[4]


input_source = csv.DictReader(open(source))
input_target = csv.DictReader(open(target))
data = {}
t = ()

for row in input_target:
    t = row['First Name'], row['number']
    for rows in input_source:
        if rows['number'] == row['number']:
            t = t + (rows[col],)
            name = row['Last Name']
            data[name] = [t]
            rows.next()
        row.next()


with open(newtarg,'w') as out:
    csv_out=csv.writer(out)
    for key, val in data.items():
        csv_out.writerow([key] + list(val))

This is probably a job for pandas, the Python data analysis library:

import pandas as pd

x1 = pd.read_csv('x1.csv')
x2 = pd.read_csv('x2.csv')
result = pd.merge(x1, x2, on='number')
result.to_csv('result.csv',
              index=False,
              columns=['Last Name', 'First Name', 'number', 'quiz'])

Reference: https://chrisalbon.com/python/pandas_join_merge_dataframe.html
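If the key column turns out to be spelled differently in the two files (the question lists `Number` in one and `number` in the other), the merge may need `left_on` and `right_on` instead of `on`. Here is a minimal sketch of that, assuming those exact header spellings. The inline sample data, the variable names, and the choice of `how='left'` (to keep people who have no quiz score) are all made up for illustration and are not from the question.

```python
import io
import pandas as pd

# Made-up sample data standing in for the two csv files.
names_csv = io.StringIO(
    "Last Name,First Name,Number\nSmith,Ann,1\nJones,Bob,2\nLee,Cy,3\n"
)
quiz_csv = io.StringIO("number,quiz\n2,90\n1,75\n")  # deliberately in a different order

names = pd.read_csv(names_csv)
quiz = pd.read_csv(quiz_csv)

# Join on differently named key columns; a left merge keeps every row of `names`.
merged = pd.merge(names, quiz, how='left', left_on='Number', right_on='number')

# Both key columns survive the merge, so drop the duplicate one.
merged = merged.drop(columns='number')

# to_csv without a path returns the csv text instead of writing a file.
csv_text = merged.to_csv(index=False)
```

Rows with no match end up with a missing value (NaN) in `quiz`, which `to_csv` writes as an empty field by default.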

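I think the following will work. Note: I've removed everything from the code in your question that wasn't being used (as you should have done before posting). I've also hardcoded the input values for testing.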

import csv

source = 'source1.csv'
target = 'target1.csv'
newtarg = 'new_output.csv'

targets = {}
with open(target) as file:
    for row in csv.DictReader(file):
        targets[row['number']] = row['quiz']

# On Python 3, the csv docs recommend opening csv files with newline=''
with open(source, newline='') as src, open(newtarg, 'w', newline='') as out:
    reader = csv.DictReader(src)
    writer = csv.writer(out)
    writer.writerow(reader.fieldnames + ['quiz'])  # create a header row (optional)
    for row in reader:
        row.update({'quiz': targets.get(row['Number'], 'no match')})
        writer.writerow(row.values())
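As for why the original nested loops only handled the first person: one likely cause is that a `csv.DictReader` can only be iterated once. The inner loop reads the second file to the end during the first pass of the outer loop, so every later pass finds nothing left to read. The sketch below reproduces that under stated assumptions: the sample data and variable names are made up, and `io.StringIO` stands in for the real files.

```python
import csv
import io

# Made-up sample data standing in for the two input files.
names_text = "Last Name,First Name,number\nSmith,Ann,1\nJones,Bob,2\n"
quiz_text = "number,quiz\n1,75\n2,90\n"

names = csv.DictReader(io.StringIO(names_text))
quizzes = csv.DictReader(io.StringIO(quiz_text))

matched = []
for person in names:
    # On the first pass this inner loop reads `quizzes` to the end;
    # on later passes the reader is already exhausted and yields nothing.
    for score in quizzes:
        if score['number'] == person['number']:
            matched.append((person['Last Name'], score['quiz']))

# Reading one file into a list first lets it be scanned repeatedly.
names = csv.DictReader(io.StringIO(names_text))
quiz_rows = list(csv.DictReader(io.StringIO(quiz_text)))
matched_all = [
    (person['Last Name'], score['quiz'])
    for person in names
    for score in quiz_rows
    if score['number'] == person['number']
]
```

Here `matched` ends up with only the first person, while `matched_all` has both. Building a dict keyed on the number, as in the code above, avoids the repeated scan altogether.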