Replacing words in text file using a dictionary
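I'm trying to open a text file and read through it, replacing certain strings with strings stored in a dictionary.
Based on answers to How do I edit a text file in Python?, I could pull out the dictionary values before doing the replacing, but looping through the dictionary seems more efficient.
The code doesn't produce any errors, but it also doesn't do any replacing.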
import fileinput

text = "sample file.txt"
fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

for line in fileinput.input(text, inplace=True):
    line = line.rstrip()
    for i in fields:
        for field in fields:
            field_value = fields[field]
            if field in line:
                line = line.replace(field, field_value)
            print(line)
import fileinput

text = "sample file.txt"
fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

for line in fileinput.input(text, inplace=True):
    line = line.rstrip()
    for field in fields:
        if field in line:
            line = line.replace(field, fields[field])
    print(line)
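If you're comfortable with Python, you can follow a hint from the official documentation (7.1. string - Common string operations): subclass the Template class so that every word becomes a new placeholder, then call safe_substitute(). That gives you a neat and reliable solution.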
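The answer above gives no code, so here is a minimal sketch of one way it could look. The class name WordTemplate, the sample text, and the single-word keys are assumptions made for illustration. Template expects its pattern to define four named groups (escaped, named, braced, invalid); the three that are not needed here use (?!), a lookahead that never matches. Because the placeholder is a single word (\w+), this approach does not handle keys that contain spaces, such as "pattern 1".

```python
from string import Template


class WordTemplate(Template):
    # Hypothetical subclass: every whole word is treated as a placeholder name.
    # The pattern is compiled in verbose mode, so whitespace here is ignored.
    pattern = r"""
        (?P<escaped>(?!))  |
        \b(?P<named>\w+)\b |
        (?P<braced>(?!))   |
        (?P<invalid>(?!))
    """


# Illustrative single-word keys (not the dictionary from the question).
fields = {"cat": "dog", "mat": "rug"}
text = "the cat sat on the mat near the cats"

# safe_substitute leaves words that are not in the mapping unchanged.
result = WordTemplate(text).safe_substitute(fields)
print(result)  # the dog sat on the rug near the cats
```

Only whole words are looked up, so "cats" is left alone even though "cat" is a key.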
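I would do it like this:
fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

# 'r+' opens for reading and writing without truncating ('w+' would empty the file first)
with open('yourfile.txt', 'r+') as f:
    s = f.read()
    for key in fields:
        s = s.replace(key, fields[key])
    f.seek(0)     # rewind before overwriting
    f.write(s)
    f.truncate()  # drop leftover content if the new text is shorter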
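I used items() to iterate over the keys and values of your fields dict.
I skip blank lines with continue and clean the others with rstrip().
I replace every key found in the line with the corresponding value from your fields dict, and I write every line with print.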
import fileinput

text = "sample file.txt"
fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

for line in fileinput.input(text, inplace=True):
    line = line.rstrip()
    if not line:
        continue  # blank lines are not printed, so they are dropped from the file
    for f_key, f_value in fields.items():
        if f_key in line:
            line = line.replace(f_key, f_value)
    print(line)
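If you can find a regex pattern that covers all your keys, you can use re.sub for a very efficient solution: you only need one pass, instead of parsing the whole text once per search term.
In your title, you mention "replacing words". In that case, '\w+' would work just fine.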
import re

fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}
words_to_replace = r'\bpattern \d+\b'

text = """Based on answers to How do I edit a text file in Python? pattern 1 I could pull out
the dictionary values before doing the replacing, but looping through the dictionary seems more efficient.
Test pattern 2
The code doesn't produce any errors, but also doesn't do any replacing. pattern 3"""

def replace_words_using_dict(matchobj):
    key = matchobj.group(0)
    # Fall back to the matched text when the key is not in the dictionary
    return fields.get(key, key)

print(re.sub(words_to_replace, replace_words_using_dict, text))
It outputs:
Based on answers to How do I edit a text file in Python? replacement text 1 I could pull out
the dictionary values before doing the replacing, but looping through the dictionary seems more efficient.
Test replacement text 2
The code doesn't produce any errors, but also doesn't do any replacing. pattern 3
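When no hand-written pattern covers all the keys, one option is to build the pattern from the dictionary itself. This is a sketch, not part of the answer above: the helper name replace_from_dict and the sample sentence are made up for illustration, and the \b anchors assume every key starts and ends with a word character.

```python
import re

fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

# Longest keys first, so a longer key wins over a shorter key that is its prefix.
alternatives = sorted(fields, key=len, reverse=True)

# re.escape keeps any regex metacharacters in the keys literal.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, alternatives)) + r")\b")


def replace_from_dict(text):
    """Replace every whole-word occurrence of a key in a single pass."""
    return pattern.sub(lambda m: fields[m.group(0)], text)


print(replace_from_dict("pattern 1, pattern 2 and pattern 12"))
# replacement text 1, replacement text 2 and pattern 12
```

Every match is guaranteed to be a key, so the callback can index the dictionary directly instead of using get().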
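Also, be very careful when modifying a file in place. I'd advise you to write the replaced text to a second file. Once you're 100% sure that it works perfectly, you can switch to inplace=True.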
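Writing to a second file could look roughly like this. The function name replace_to_new_file, the output file name, and the demo content are assumptions for illustration; the demo runs in a temporary directory so that no real file is touched.

```python
import os
import tempfile

fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}


def replace_to_new_file(src_path, dst_path, mapping):
    """Read src_path, apply every replacement, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            for key, value in mapping.items():
                line = line.replace(key, value)
            dst.write(line)  # the line still ends with its newline


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample file.txt")
    dst = os.path.join(tmp, "sample file.replaced.txt")
    with open(src, "w", encoding="utf-8") as f:
        f.write("first pattern 1\n\nthen pattern 2\n")

    replace_to_new_file(src, dst, fields)

    with open(dst, encoding="utf-8") as f:
        result = f.read()
    with open(src, encoding="utf-8") as f:
        original = f.read()

print(result)
```

The source file is only read, so it can be compared with the output before anything is overwritten. Blank lines are also preserved, because lines are written back unchanged apart from the replacements.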
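I just figured out how to replace a lot of different words in a txt file in one go by iterating over a dictionary (whole-word matches only).
It would be really annoying if I wanted to replace "1" with "John" but ended up turning "12" into "John2". The following code works for me.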
import re

match = {}  # dictionary mapping each word to replace to its replacement

with open("filename", "r") as f:
    data = f.read()  # string of all file content

def replace_all(text, dic):
    for i, j in dic.items():
        # \b...\b restricts the replacement to whole-word matches;
        # re.escape keeps regex metacharacters in the key literal
        text = re.sub(r"\b%s\b" % re.escape(i), j, text)
    return text

data = replace_all(data, match)
print(data)  # you can copy and paste the result to whatever file you like