Python: writing a program to iterate over a CSV file, match a field, and save the result in a different data file
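I am trying to write a program that does the following:
Take a field from a record in a CSV file called data.
Take a field from a record in a CSV file called log.
Compare the two fields at the current position in data and log. If they match on the same row, write the record from the log file to a new file called result.
If the field does not match the record at that position in the log file, move on to the next record in the log file and compare again until a matching record is found, then save that record to the result file.
Reset the index of the log file.
Move to the next row in the data file and repeat the check until the end of the data file is reached.
This is what I have managed so far, but I am stuck: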
import csv

def main():
    datafile_csv = open('data.txt')
    logfile_csv = open('log.txt')
    row_data = []
    row_log = []
    row_log_temp = []
    index_data = 1
    index_log = 1
    index_log_temp = index_log
    counter = 0
    data = ''
    datareader = ''
    logreader = ''
    log = ''
    # row = 0
    logfile_len = sum (1 for lines in open('log.txt'))
    with open('resultfile.csv','w') as csvfile:
        out_write = csv.writer(csvfile, delimiter=',',quotechar='"')
        with open('data.txt','r') as (data):
            row_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            row_data = next(data)
            print(row_data)
            with open ('log.txt','r') as (log):
                row_log = next(log)
                print(row_log)
                while counter != logfile_len:
                    comp_data = row_data[index_data:]
                    comp_log = row_log[index_log:]
                    comp_data = comp_data.strip('"')
                    comp_log = comp_log.strip('"')
                    print(row_data[1])
                    print(comp_data)
                    print(comp_log)
                    if comp_data != comp_log:
                        while comp_data != comp_log:
                            row_log = next(log)
                            comp_log = row_log[index_log]
                        out_write.writerow(row_log)
                        row_data = next(data)
                    else :
                        out_write.writerow(row_log)
                        row_data = next(data)
                    log.seek(0)
                    counter +=1
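The problems I am running into are these:
I cannot properly convert the data row from a string, so I cannot compare the fields correctly.
I also need to be able to reset the pointer in the log file, but seek does not seem to work...
This is the content of the data file: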
"test1","test2","test3"
"1","2","3"
"4","5","6"
This is the content of the log file:
"test1","test2","test3"
"4","5","6"
"1","2","3"
This is what the interpreter returns:
t
"test1","test2","test3"
t
test1","test2","test3"
test1","test2","test3"
1
1","2","3"
test1","test2","test3"
Traceback (most recent call last):
  File "H:/test.py", line 100, in <module>
    main()
  File "H:/test.py", line 40, in main
    comp_log = row_log[index_log]
IndexError: string index out of range
Thank you very much for your help.
Regards,
Danilo
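On the two problems named in the question: in the posted code the rows come from next(data) and next(log), so they are plain strings, and indexing them yields single characters rather than fields. The sketch below is one possible way to follow the described steps literally with the standard csv module. It assumes the field to compare is the second column (index 1, as in the posted code), and it uses io.StringIO with the sample contents from the question in place of the real files. The function name find_matches and its parameters are made up for illustration.

```python
import csv
import io

# Sample contents taken from the question; StringIO stands in for the real files.
DATA_TEXT = '"test1","test2","test3"\n"1","2","3"\n"4","5","6"\n'
LOG_TEXT = '"test1","test2","test3"\n"4","5","6"\n"1","2","3"\n'


def find_matches(data_file, log_file, field_index=1):
    """For each data row, scan the log from the top and collect the first
    log row whose field at field_index equals the data row's field."""
    matches = []
    for data_row in csv.reader(data_file):
        if len(data_row) <= field_index:
            continue  # skip blank or short lines instead of raising IndexError
        log_file.seek(0)  # rewind the underlying file ...
        for log_row in csv.reader(log_file):  # ... and start a fresh reader on it
            if len(log_row) > field_index and log_row[field_index] == data_row[field_index]:
                matches.append(log_row)
                break  # stop at the first match, then move to the next data row
    return matches


matches = find_matches(io.StringIO(DATA_TEXT), io.StringIO(LOG_TEXT))
for row in matches:
    print(row)
```

Here csv.reader turns each line into a list of fields, so the comparison is between whole field values, and the rewind is done with seek(0) on the file object followed by a new reader. The header row is treated like any other row in this sketch.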
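Join the two files on columns (the row number and a specific column [undefined]) and return a result restricted to the columns of the left/first file.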
import petl
log = petl.fromcsv('log.txt').addrownumbers() # Load csv/txt file into PETL table, and add row numbers
log_columns = len(petl.header(log)) # Get the amount of columns in the log file
data = petl.fromcsv('data.txt').addrownumbers() # Load csv/txt file into PETL table, and add row numbers
joined_files = petl.join(log, data, key=['row', 'SpecificField']) # Join the tables using row and a specific field
joined_files = petl.cut(joined_files, *range(1, log_columns)) # Remove the extra columns obtained from right table
petl.tocsv(joined_files, 'resultfile.csv') # Output results to csv file
log.txt
data.txt
resultfile.csv
Also, don't forget to pip install (the version used in this example):
pip install petl==1.0.11
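If installing petl is not an option, a standard-library variant might look like the following. This is only a sketch under assumptions: unlike the join above it matches on the field value alone, regardless of row number, which is how the question describes the search. It compares the second column (index 1), and the function name write_matching_log_rows is hypothetical. The log is read once into a dictionary, so no rewinding is needed.

```python
import csv
import os
import tempfile


def write_matching_log_rows(data_path, log_path, result_path, field_index=1):
    """Write, for each data row, the first log row sharing the same field value."""
    # Read the log once and remember the first row seen for each field value.
    first_log_row = {}
    with open(log_path, newline='') as log_file:
        for row in csv.reader(log_file):
            if len(row) > field_index:
                first_log_row.setdefault(row[field_index], row)

    written = 0
    with open(data_path, newline='') as data_file, \
            open(result_path, 'w', newline='') as result_file:
        writer = csv.writer(result_file, quoting=csv.QUOTE_ALL)
        for row in csv.reader(data_file):
            if len(row) > field_index and row[field_index] in first_log_row:
                writer.writerow(first_log_row[row[field_index]])
                written += 1
    return written


# Demo with the sample contents from the question, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, 'data.txt')
    log_path = os.path.join(tmp, 'log.txt')
    result_path = os.path.join(tmp, 'resultfile.csv')
    with open(data_path, 'w', newline='') as f:
        f.write('"test1","test2","test3"\n"1","2","3"\n"4","5","6"\n')
    with open(log_path, 'w', newline='') as f:
        f.write('"test1","test2","test3"\n"4","5","6"\n"1","2","3"\n')
    rows_written = write_matching_log_rows(data_path, log_path, result_path)
    with open(result_path, newline='') as f:
        result_rows = list(csv.reader(f))
    print(result_rows)
```

Data rows with no matching log row are silently skipped in this sketch; whether that is the desired behavior is not stated in the question.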