Python: writing a program to iterate over a csv file, match a field, and save the result in a different data file

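I am trying to write a program that does the following:

Take a field from a record in a csv file named data. Take a field from a record in a csv file named log.

Compare the two by their position in data and log. If they are on the same line, write the record from the file named log to a new file named result. If the field does not match the log record at that position, move to the next record in the log file and compare again until a matching record is found, then save that record to the file named result. Reset the index of the log file, go to the next line in the data file, and repeat the check until the end of the data file is reached.

This is what I have managed so far, but I am stuck: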

import csv
def main():

    datafile_csv = open('data.txt')
    logfile_csv = open('log.txt')
    row_data = []
    row_log = []
    row_log_temp = []
    index_data = 1
    index_log = 1
    index_log_temp = index_log
    counter = 0
    data = ''
    datareader = ''
    logreader = ''
    log = ''
#   row = 0
    logfile_len = sum (1 for lines in open('log.txt'))
    with open('resultfile.csv','w') as csvfile:
        out_write = csv.writer(csvfile,  delimiter=',',quotechar='"')
        with open('data.txt','r') as (data):
            row_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            row_data = next(data)
            print(row_data)
            with open ('log.txt','r') as (log):
                row_log = next(log)
                print(row_log)
                while counter != logfile_len:
                    comp_data = row_data[index_data:]
                    comp_log = row_log[index_log:]
                    comp_data = comp_data.strip('"')
                    comp_log = comp_log.strip('"')
                    print(row_data[1])
                    print(comp_data)
                    print(comp_log)
                    if comp_data != comp_log:
                        while comp_data != comp_log:
                            row_log = next(log)
                            comp_log = row_log[index_log]
                        out_write.writerow(row_log)
                        row_data = next(data)
                    else : 
                        out_write.writerow(row_log)
                        row_data = next(data)
                    log.seek(0)
                    counter +=1

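The problems I am having are the following:

I cannot correctly convert the data rows from strings, and I cannot compare them correctly.

I also need to be able to reset the pointer in the log file, but seek does not seem to work...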

Here is the content of the data file:

"test1","test2","test3"
"1","2","3"
"4","5","6"

Here is the content of the log file:

"test1","test2","test3"
"4","5","6"
"1","2","3"

This is what the interpreter returns:

t "test1","test2","test3"

t test1","test2","test3"

test1","test2","test3"

1 1","2","3"

test1","test2","test3"

Traceback (most recent call last):
File "H:/test.py", line 100, in <module>
main()
File "H:/test.py", line 40, in main
comp_log = row_log[index_log]
IndexError: string index out of range

Thank you very much for your help.

Regards,

Danilo

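Join the two files on columns (the row number and a specific column [undefined]), and return a result limited to the columns of the left/first file.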

import petl

log = petl.fromcsv('log.txt').addrownumbers()  # Load csv/txt file into PETL table, and add row numbers 
log_columns = len(petl.header(log))  # Get the amount of columns in the log file
data = petl.fromcsv('data.txt').addrownumbers()  # Load csv/txt file into PETL table, and add row numbers 
joined_files = petl.join(log, data, key=['row', 'SpecificField'])  # Join the tables using row and a specific field
joined_files = petl.cut(joined_files, *range(1, log_columns))  # Remove the extra columns obtained from right table
petl.tocsv(joined_files, 'resultfile.csv')  # Output results to csv file

log.txt

data.txt

resultfile.csv

Also, don't forget to pip install (the version used in this example):

pip install petl==1.0.11
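
For comparison, here is a minimal sketch that uses only the standard-library csv module and follows the loop described in the question more literally. It is not the petl approach: instead of joining on the row number, it rewinds the log file for every data row and copies the first log row whose field matches. The function name match_rows and the field_index parameter are made up for illustration, and the sketch assumes the field to compare is the first column and that both files are comma-separated with double-quoted values.

```python
import csv


def match_rows(data_path, log_path, result_path, field_index=0):
    """For each data row, copy the first log row with the same field value.

    field_index is a hypothetical parameter: the column used for matching.
    """
    with open(data_path, newline='') as data_file, \
         open(log_path, newline='') as log_file, \
         open(result_path, 'w', newline='') as result_file:
        writer = csv.writer(result_file, quoting=csv.QUOTE_ALL)
        for data_row in csv.reader(data_file):
            if len(data_row) <= field_index:
                continue  # skip blank or short lines
            log_file.seek(0)  # reset the log file before each search
            for log_row in csv.reader(log_file):
                if (len(log_row) > field_index
                        and log_row[field_index] == data_row[field_index]):
                    writer.writerow(log_row)
                    break  # stop at the first match


if __name__ == '__main__':
    match_rows('data.txt', 'log.txt', 'resultfile.csv')
```

Letting csv.reader parse each line into a list of fields avoids slicing and stripping quotes from raw strings, and calling seek(0) before creating a fresh reader is what resets the search to the top of the log file. Rescanning the log for every data row is fine for small files; for large ones, loading the log into a dictionary keyed by the field would likely be faster.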