Find matches and add a column
I want to accomplish this specific task. I have 2 files; the first contains emails and credentials:
xavier.desprez@william.com:Xavier
xavier.locqueneux@william.com:vocojydu
xaviere.chevry@pepe.com:voluzigy
Xavier.Therin@william.com:Pussycat5
xiomara.rivera@william.com:xrhj1971
xiomara.rivera@william-honduras.william.com:xrhj1971
The second contains email addresses and locations:
xavier.desprez@william.com:BOSNIA
xaviere.chevry@pepe.com:ROMANIA
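Whenever an email from the first file is found in the second file, I want that line to be replaced with EMAIL:CREDENTIAL:LOCATION; when it is not found, the line should end up as EMAIL:CREDENTIAL:BLANK
So the final file must look like this: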
xavier.desprez@william.com:Xavier:BOSNIA
xavier.locqueneux@william.com:vocojydu:BLANK
xaviere.chevry@pepe.com:voluzigy:ROMANIA
Xavier.Therin@william.com:Pussycat5:BLANK
xiomara.rivera@william.com:xrhj1971:BLANK
xiomara.rivera@william-honduras.william.com:xrhj1971:BLANK
I made several attempts in Python, but they are not even worth posting, because I am nowhere near a solution.
Regards!
Edit:
This is what I tried:
import os
import sys
with open("test.txt", "r") as a_file:
    for line_a in a_file:
        stripped_email_a = line_a.strip().split(':')[0]
        with open("location.txt", "r") as b_file:
            for line_b in b_file:
                stripped_email_b = line_b.strip().split(':')[0]
                location = line_b.strip().split(':')[1]
                if stripped_email_a == stripped_email_b:
                    a = line_a + ":" + location
                    print(a.replace("\n",""))
                else:
                    b = line_a + ":BLANK"
                    print (b.replace("\n",""))
This is the result I get:
xavier.desprez@william.com:Xavier:BOSNIA
xavier.desprez@william.com:Xavier:BLANK
xaviere.chevry@pepe.com:voluzigy:BLANK
xaviere.chevry@pepe.com:voluzigy:ROMANIA
xavier.locqueneux@william.com:vocojydu:BLANK
xavier.locqueneux@william.com:vocojydu:BLANK
Xavier.Therin@william.com:Pussycat5:BLANK
Xavier.Therin@william.com:Pussycat5:BLANK
xiomara.rivera@william.com:xrhj1971:BLANK
xiomara.rivera@william.com:xrhj1971:BLANK
xiomara.rivera@william-honduras.william.com:xrhj1971:BLANK
xiomara.rivera@william-honduras.william.com:xrhj1971:BLANK
I am very close, but I get duplicates ;)
Regards
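The duplicates occur because you are reading the two files in a nested fashion. As soon as one line of test.txt is read, you open location.txt and process all of it. Then you read the second line of test.txt, reopen location.txt, and process it again. Since the inner loop prints something for every comparison, each line of test.txt is printed once per line of location.txt (twice with your sample data).
Instead, collect all the necessary data from location.txt first, for example into a dictionary, and then use it while reading test.txt: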
# Build a lookup table of email -> location from location.txt.
email_loc_dict = {}
with open("location.txt", "r") as b_file:
    for line_b in b_file:
        splits = line_b.strip().split(':')
        email_loc_dict[splits[0]] = splits[1]

# Read test.txt once; append the location, or BLANK if the email is unknown.
with open("test.txt", "r") as a_file:
    for line_a in a_file:
        line_a = line_a.strip()
        stripped_email_a = line_a.split(':')[0]
        if stripped_email_a in email_loc_dict:
            a = line_a + ":" + email_loc_dict[stripped_email_a]
            print(a)
        else:
            b = line_a + ":BLANK"
            print(b)
Output:
xavier.desprez@william.com:Xavier:BOSNIA
xavier.locqueneux@william.com:vocojydu:BLANK
xaviere.chevry@pepe.com:voluzigy:ROMANIA
Xavier.Therin@william.com:Pussycat5:BLANK
xiomara.rivera@william.com:xrhj1971:BLANK
xiomara.rivera@william-honduras.william.com:xrhj1971:BLANK
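Since the question asks for a final file rather than printed output, the same dictionary approach could be wrapped in functions along these lines. This is only a sketch: the names merge_locations and merge_files and the output path are hypothetical, and it assumes the email never contains a ':' and that matching is exact (case-sensitive), as in the code above. It also skips empty lines and uses dict.get to supply the BLANK default.

```python
def merge_locations(credential_lines, location_lines, default="BLANK"):
    """Return EMAIL:CREDENTIAL:LOCATION strings, using `default` when no location is known."""
    # Map each email to its location, splitting on the first ':' only.
    locations = {}
    for line in location_lines:
        line = line.strip()
        if not line:
            continue  # ignore empty lines
        email, _, location = line.partition(":")
        locations[email] = location

    merged = []
    for line in credential_lines:
        line = line.strip()
        if not line:
            continue  # ignore empty lines
        email = line.partition(":")[0]
        # dict.get returns `default` when the email has no known location.
        merged.append(f"{line}:{locations.get(email, default)}")
    return merged


def merge_files(credentials_path, locations_path, output_path):
    """Read both files, merge them, and write one merged line per credential line."""
    with open(credentials_path) as a_file, open(locations_path) as b_file:
        merged = merge_locations(a_file, b_file)
    with open(output_path, "w") as out_file:
        out_file.writelines(line + "\n" for line in merged)
```

For example, merge_files("test.txt", "location.txt", "merged.txt") would write the merged lines to merged.txt, where merged.txt is just a placeholder name.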