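Remove quotes holding 2 words and remove comma between them
Follow-up to: Python to replace a symbol between 2 words in a quote
Extended input and expected output:
I'm trying to replace the comma between the two words Durango and PC in the second line with & and then remove the quotes " as well. The same goes for the third line, with Orbis and PC, and for the fourth line, where there are two quoted word combinations I would like to process: "AAA - Character Tech, SOF - UPIs","Durango, Orbis, PC"
I want to keep the rest of the lines as they are, using Python.
Input: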
2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened
3,SIN-Audio,AAA - Audio,"Orbis, PC",13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,"AAA - Character Tech, SOF - UPIs","Durango, Orbis, PC",29,Waiting For
...
...
...
There can be 100 lines like this in my sample. So the expected output is:
2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering, Durango & PC,55,Reopened
3,SIN-Audio,AAA - Audio, Orbis & PC,13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango, Orbis & PC,29,Waiting For
...
...
...
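So far, I can think of reading the file line by line and, if a line contains quotes, replacing them with nothing, but replacing the symbol inside the quotes is where I'm stuck.
This is what I have right now: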
for line in lines:
    expr2 = re.findall('"(.*?)"', line)
    if len(expr2)!=0:
        expr3 = re.split('"',line)
        expr4 = expr3[0]+expr3[1].replace(","," &")+expr3[2]
        print >>k, expr4
    else:
        print >>k, line
But this doesn't handle the case in line 4. There can also be more than 3 combinations. For example:
3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open
and I would like to turn it into this:
3,SIN-Audio,AAA - Audio & xxxx & yyyy, Orbis & PC, 13 & 22,Open
How can I achieve this? Any suggestions? I'm learning Python.
So, by treating the input file as a .csv, we can easily turn the lines into something that is easier to work with.
For example,
2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened
is read as:
['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango, PC', '55', 'Reopened']
Then, by replacing every "," inside a field with " &" (a space followed by an ampersand), we get the following row:
['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango & PC', '55', 'Reopened']
This also replaces multiple commas within a single line, and when the row is finally written out the original double quotes are gone.
Here is the code, assuming in.txt is your input file; it writes to out.txt.
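import csv

# newline='' lets the csv module handle line endings itself
with open('in.txt', newline='') as infile, open('out.txt', 'w') as outfile:
    reader = csv.reader(infile)
    for row in reader:
        # The reader has already stripped the quotes, so any comma left
        # inside a field came from a quoted value; turn it into ' &'
        row = [field.replace(',', ' &') for field in row]
        outfile.write(','.join(row) + '\n')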
The output for the fourth line is:
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango & Orbis & PC,29,Waiting For
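To try the same idea without creating any files, it can be sketched on a single in-memory line. The helper name convert_line is made up for this illustration, and the sketch assumes fields never contain escaped quotes or embedded newlines.

```python
import csv

def convert_line(line):
    """Parse one CSV line and replace commas inside fields with ' &'."""
    # csv.reader accepts any iterable of strings, so a one-element list works
    row = next(csv.reader([line]))
    return ','.join(field.replace(',', ' &') for field in row)

print(convert_line('3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open'))
# prints: 3,SIN-Audio,AAA - Audio & xxxx & yyyy,Orbis & PC,13 & 22,Open
```

Note that the result has no space after the field separator, which differs slightly from the leading space shown in the question's expected output.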
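Please check this: I couldn't find a single expression that does it, so I did it in a slightly more involved way. I'll update if I find a better approach (Python 3).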
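import re

st = "3,SIN-Audio,\"AAA - Audio, xxxx, yyyy\",\"Orbis, PC\",\"13, 22\",Open"
# Greedy match from the first quote to the last one, then split on '","'.
# This only works when all quoted fields sit next to each other.
found = re.findall(r'\"(.*)\"',st)[0].split("\",\"")
final = ""
for word in found:
    # Replace the commas inside each quoted group with ' &'
    final = final + (" &").join(word.split(","))+","
# Swap the whole quoted region for the rebuilt text (minus the trailing comma)
result = re.sub(r'\"(.*)\"',final[:-1],st)
print(result)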
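For what it's worth, a single substitution does seem possible if re.sub is given a function as the replacement. The following is only a sketch, not part of the answer above: the name merge_quoted is invented here, and it assumes quoted fields never contain a double quote themselves.

```python
import re

# Matches one quoted span and captures the text between the quotes
QUOTED = re.compile(r'"([^"]*)"')

def merge_quoted(line):
    # For each quoted span, drop the quotes and turn inner commas into ' &'
    return QUOTED.sub(lambda m: m.group(1).replace(',', ' &'), line)

print(merge_quoted('3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open'))
# prints: 3,SIN-Audio,AAA - Audio & xxxx & yyyy,Orbis & PC,13 & 22,Open
```

Because each quoted span is handled on its own, this also copes with unquoted fields sitting between quoted ones.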