Remove quotes holding 2 words and remove comma between them

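Follow-up to: Python to replace a symbol between 2 words in a quote

Extended input and expected output:

I'm trying to replace the comma between the two words Durango and PC in the second line with & and then remove the quotes " as well. The same goes for the third line with Orbis and PC. The fourth line has two quoted multi-word combinations that I would also like to process: "AAA - Character Tech, SOF - UPIs","Durango, Orbis, PC"

I want to keep the rest of each line as it is, using Python.

Input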

2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened
3,SIN-Audio,AAA - Audio,"Orbis, PC",13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,"AAA - Character Tech, SOF - UPIs","Durango, Orbis, PC",29,Waiting For
...
... 
...

There can be around 100 lines like this in my sample. The expected output is:

2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering, Durango & PC,55,Reopened
3,SIN-Audio,AAA - Audio, Orbis & PC,13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango, Orbis & PC,29,Waiting For
...
...
...

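So far, I can think of reading the file line by line and, if a line contains quotes, replacing them with nothing. Replacing the symbol inside the quotes is where I'm stuck.

This is what I have right now: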

for line in lines:
    expr2 = re.findall('"(.*?)"', line)
    if len(expr2) != 0:
        expr3 = re.split('"', line)
        expr4 = expr3[0] + expr3[1].replace(",", " &") + expr3[2]
        print >>k, expr4
    else:
        print >>k, line

But this doesn't handle the case in line 4. There can also be more than 3 combinations, for example:

3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open 

and I would like to turn it into 3,SIN-Audio,AAA - Audio & xxxx & yyyy, Orbis & PC, 13 & 22,Open

How can I achieve this? Any suggestions? I'm learning Python.

By treating the input file as a .csv, we can easily turn each line into something simple to work with.

For example,

2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened

is read as:

['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango, PC', '55', 'Reopened']

Then, by replacing every instance of , with " &" (a space followed by &) in each field, we get the following row:

['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango & PC', '55', 'Reopened']

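This replaces multiple commas within a single field as well, and when the row is finally written out, the original double quotes are gone.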
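
The parsing and replacement steps described above can be checked on a single line without touching any files. This is only a minimal sketch: io.StringIO stands in for a real file, and the variable names are chosen for illustration.

```python
import csv
import io

# One sample line from the question, with a quoted field containing a comma.
sample = '2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened\n'

# csv.reader accepts any iterable of lines; StringIO plays the role of a file.
row = next(csv.reader(io.StringIO(sample)))
print(row)

# Replace the comma inside each field; the parser has already removed the quotes.
cleaned = [field.replace(',', ' &') for field in row]
print(','.join(cleaned))
```

The first print shows the quoted field arriving as the single value 'Durango, PC', which is why a plain string replace on each field is enough.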

Here is the code. It assumes in.txt is your input file and writes the result to out.txt:

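import csv

# Parse each line as CSV so a quoted field such as "Durango, PC" arrives as one value.
with open('in.txt', newline='') as infile, open('out.txt', 'w') as outfile:
    for row in csv.reader(infile):
        # Replace the commas inside each field; the quotes were dropped by the parser.
        row = [field.replace(',', ' &') for field in row]
        # Join by hand so that no quoting is applied on output.
        outfile.write(','.join(row) + '\n')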

The output for the fourth line is:

LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango & Orbis & PC,29,Waiting For
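
To try the same approach on the extended example from the question (several quoted fields in one row) without creating in.txt, the loop can be wrapped in a small function. This is a sketch under that assumption; convert_lines is a hypothetical helper name, not part of the csv module.

```python
import csv

def convert_lines(lines):
    """Yield each CSV line with in-field commas turned into ' &' and quotes dropped.

    convert_lines is a hypothetical helper name used only for this sketch.
    """
    # csv.reader works on any iterable of strings, not just open files.
    for row in csv.reader(lines):
        yield ','.join(field.replace(',', ' &') for field in row)

sample = [
    '2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened',
    '3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open',
]
for converted in convert_lines(sample):
    print(converted)
```

Lines without quoted fields pass through unchanged. Note that, unlike the expected output written in the question, this does not insert an extra space after the field separator.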

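Please check this one: I couldn't find a single expression that does this, so it is done in a slightly roundabout way. I'll update if I find a better approach. (Python 3)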

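import re

st = "3,SIN-Audio,\"AAA - Audio, xxxx, yyyy\",\"Orbis, PC\",\"13, 22\",Open"
# Greedy match from the first quote to the last, then split on the "," between quoted fields.
# Note: this assumes the line has at least one quoted field and that all quoted fields are adjacent.
found = re.findall(r'"(.*)"', st)[0].split('","')
# Replace the commas inside each quoted field with " &" and rejoin the fields.
final = ",".join(" &".join(word.split(",")) for word in found)
# Substitute the whole quoted span with the rewritten fields.
result = re.sub(r'"(.*)"', final, st)
print(result)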
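
One possible way to do this with a single expression is to pass a function as the replacement argument of re.sub, so that each quoted field is rewritten on its own. This is a sketch that assumes fields never contain escaped quotes (""); for data like that, the csv module is likely the safer choice.

```python
import re

st = '3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open'

# "([^"]*)" matches one quoted field at a time, since [^"]* cannot cross a quote.
# The lambda receives each match and returns the field's contents with the
# commas replaced and the surrounding quotes dropped.
result = re.sub(r'"([^"]*)"', lambda m: m.group(1).replace(',', ' &'), st)
print(result)
```

Because every quoted field is matched separately, this also works when quoted and unquoted fields alternate, and a line with no quotes is returned unchanged.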