How to compare the differences between 2 txt files and output them to a new txt file
How can I compare two txt files, print the differences to the shell, and write them to a new txt file?
The working files are at this link.
Use diff, of course.
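For anyone who wants the diff-style comparison from inside Python, here is a minimal sketch that uses the standard library's difflib as a stand-in for the diff command. The function name diff_files and the output path are my own placeholders, not something from the question.

```python
import difflib
from pathlib import Path


def diff_files(path_a, path_b, out_path):
    """Write a unified diff of two text files to out_path and return its lines.

    diff_files and out_path are hypothetical names used for illustration.
    """
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    # lineterm="" because the lines were split without their newlines
    diff_lines = list(
        difflib.unified_diff(
            lines_a, lines_b,
            fromfile=str(path_a), tofile=str(path_b),
            lineterm="",
        )
    )
    # An empty diff (identical files) produces an empty output file
    text = "\n".join(diff_lines) + ("\n" if diff_lines else "")
    Path(out_path).write_text(text)
    return diff_lines
```

Calling diff_files('members_1.txt', 'members_2.txt', 'output.txt') and printing the returned lines with print(*lines, sep='\n') would show removed lines prefixed with - and added lines prefixed with +.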
Use drop_duplicates with Pandas:
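import pandas as pd

# Read each file as a single-column frame and drop repeats within the file
df1 = pd.read_csv('members_1.txt', header=None).drop_duplicates()
df2 = pd.read_csv('members_2.txt', header=None).drop_duplicates()
# keep=False drops every row that appears in both files, leaving only the differences
out = pd.concat([df1, df2]).drop_duplicates(keep=False)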
Output:
>>> print(*out[0].to_list(), sep='\n')
LEE RI KE
LIM YONG
KOH CHEE KIAT
LEE YONG
KOH CHEW KIAT
LEE RI KHEE
Or use set in Python:
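with open('members_1.txt') as fp1, open('members_2.txt') as fp2:
    # Strip the trailing newline from each line before comparing
    data1 = set(l.strip() for l in fp1)
    data2 = set(l.strip() for l in fp2)
# Names that appear in exactly one of the two files
out = data1.symmetric_difference(data2)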
Output:
>>> print(*out, sep='\n')
KOH CHEW KIAT
LEE RI KE
LEE YONG
KOH CHEE KIAT
LEE RI KHEE
LIM YONG
Update: export to a file
with open('output.txt', 'w') as fp:
    print(*out, sep='\n', file=fp)
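One caveat with the set approach: a set has no defined order, so the lines in output.txt can come out in a different order from run to run. If a stable order matters, a sketch along these lines keeps the names in the order they appear in each file. The helper names read_names, ordered_difference and write_difference are hypothetical, not part of the answer above.

```python
def read_names(path):
    """Return the non-empty, stripped lines of a text file in file order."""
    with open(path) as fp:
        return [line.strip() for line in fp if line.strip()]


def ordered_difference(path_a, path_b):
    """Names found in only one file: first those only in A, then those only in B."""
    names_a = read_names(path_a)
    names_b = read_names(path_b)
    set_a, set_b = set(names_a), set(names_b)
    # dict.fromkeys removes repeats while preserving first-seen order
    only_a = [n for n in dict.fromkeys(names_a) if n not in set_b]
    only_b = [n for n in dict.fromkeys(names_b) if n not in set_a]
    return only_a + only_b


def write_difference(path_a, path_b, out_path):
    """Write the ordered difference to out_path, one name per line, and return it."""
    diff = ordered_difference(path_a, path_b)
    with open(out_path, "w") as fp:
        fp.write("\n".join(diff) + ("\n" if diff else ""))
    return diff
```

For example, print(*write_difference('members_1.txt', 'members_2.txt', 'output.txt'), sep='\n') would both print the differences to the shell and save them to the file.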