Replacing double space with single space in line containing certain string
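I have a large text file made up of rows and columns. Every string/data field in the file is separated by a double space. For my code to work, however, I need the double spaces to become single spaces in certain lines only. Those lines all begin with the same string.
I tried: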
with open(outfile) as f3, open(outfile2,'w') as f4:
    for line in f3:
        line = line.strip()
        if "SAMPLE" in line:
            " ".join(line.split())
        if 'xyz' not in line and len(line) >=46:
            f4.write(line+'\n')
I also tried:
import re
with open(outfile) as f3, open(outfile2,'w') as f4:
    for line in f3:
        if "SAMPLE" in line:
            re.sub("\s\s+" , " ", line)
        if 'xyz' not in line and len(line) >=46:
            f4.write(line)
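Neither works. The second if statement removes some lines I don't want, so it has to stay (that part works as expected). However, the double spacing between all the data in the text file is still there. How can I replace the double spaces between words with single spaces in just the lines that contain "SAMPLE"?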
Try this:
s = " ".join(your_string.split())
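Your problem comes from the immutability of strings: " ".join(line.split()) creates a new string, which is most likely what you need, but you have to assign it back to the line variable.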
if "SAMPLE" in line:
    line = " ".join(line.split())
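To make the point concrete, here is a minimal sketch (the sample text is made up for illustration). Both " ".join(...) and re.sub(...) return a new string and leave the original untouched, so nothing changes unless the result is bound to a name:

```python
import re

line = "SAMPLE  a  b"          # hypothetical sample line with double spaces

" ".join(line.split())         # result is computed, then thrown away
re.sub(r"\s\s+", " ", line)    # same here: re.sub returns a new string
unchanged = line               # line still holds the double spaces

fixed_join = " ".join(line.split())      # keep the result by assigning it
fixed_re = re.sub(r"\s\s+", " ", line)   # regex variant, also assigned
```

Either variant works once it is assigned. The split/join form also trims leading and trailing whitespace, which the regex form does not.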
Later edit:
The second if is a bit "strange"... what is the expected result?
if not line or (':' and len(line) >=46):
    f4.write(line)
Especially the second part... ':' always evaluates to True, so it looks useless; it is probably a typo or something is missing.
This writes to the file only when line is empty or None (i.e. it evaluates to False) or when the length of the line is >= 46.
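The behaviour of that condition can be checked directly. A minimal sketch, where should_write is a hypothetical helper that simply wraps the condition quoted above:

```python
def should_write(line):
    # Same condition as in the snippet above. ':' is a non-empty string,
    # so it is always truthy and the `and` reduces to the length check.
    return bool(not line or (':' and len(line) >= 46))

empty_ok = should_write("")          # empty line: condition is true
short_ok = should_write("x" * 10)    # short, non-empty line: false
long_ok = should_write("x" * 46)     # line of length 46: true
```

In other words, the ':' operand contributes nothing; the test behaves exactly like `not line or len(line) >= 46`.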
The code should look like this:
with open(outfile) as f3, open(outfile2,'w') as f4:
    for line in f3:
        line = line.strip()
        if "SAMPLE" in line:
            # collapse any double/multiple spaces if the line contains "SAMPLE"
            line = " ".join(line.split())
        if 'xyz' not in line and len(line) >=46:
            # write to the second file only the lines that
            # don't contain 'xyz' and have a length >= 46
            f4.write(line+'\n')
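To try the logic without touching any files, the same steps can be pulled into a small generator. This is only a sketch: the name clean_lines, its parameters, and the sample data are hypothetical, while the "SAMPLE" and 'xyz' markers and the threshold 46 come from the code above.

```python
def clean_lines(lines, marker="SAMPLE", skip="xyz", min_len=46):
    """Yield filtered lines; collapse whitespace runs only in marker lines."""
    for line in lines:
        line = line.strip()
        if marker in line:
            # split() with no argument splits on any run of whitespace,
            # so join() leaves exactly one space between the fields
            line = " ".join(line.split())
        if skip not in line and len(line) >= min_len:
            yield line

# Made-up input: ten fields per line, separated by double spaces
fields = "  ".join(["field"] * 10)
sample = [
    "SAMPLE  " + fields + "\n",   # marker line: spaces get collapsed
    "OTHER  " + fields + "\n",    # non-marker line: kept as-is
    "short  line\n",              # dropped: shorter than 46 characters
    "xyz  " + fields + "\n",      # dropped: contains 'xyz'
]
result = list(clean_lines(sample))
```

With real files, the generator could be fed the open input file and each yielded line written out with a trailing '\n', as in the loop above.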