Regex string substitution in file
I need to replace strings in a file, but so far without success. Any suggestions are welcome.
I have a file output.txt with the following content:
2021-07-28 10:27:49,869 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following>
2021-07-28 10:27:49,881 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following>
2021-07-28 10:27:51,834 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following>
2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following>
I have code that is supposed to turn the first part of each line:
2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT
into:
<time>2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT</time>
The code is as follows:
import re

regex_time_xml_div = r"\d+-\d+-\d+ \d+:\d+:\d+,\d+\s[0-z]{7}\s[0-z]{9}\s.{34}"
with open(r'output\output.txt',"r+") as file:
    list_of_timestamps = []
    for line in file:
        if re.search(regex_time_xml_div, str(line)):
            list_of_timestamps.append(line)
    content = file.read()
    for i in list_of_timestamps:
        result = re.sub(regex_time_xml_div,'<time>'+i+'</time>',content)
        print(result,file=open(r'output\output_new.txt',"a"))
But the resulting file output_new.txt contains only 4 empty lines. Can anyone help with this? Thanks for any advice.
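A likely cause of the four empty lines, judging from the posted code: the for loop reads the file to its end, so the later file.read() returns an empty string. re.sub on an empty string returns an empty string, and each print then writes only a newline, once per collected line. A minimal sketch of that behaviour, using io.StringIO as a stand-in for output.txt (an assumption made only to keep the example self-contained):

```python
import io

# io.StringIO stands in for the opened output.txt (illustrative stand-in).
sample = "line one\nline two\n"
file = io.StringIO(sample)

lines = [line for line in file]    # iterating consumes the whole stream
content_after_loop = file.read()   # nothing is left to read

file.seek(0)                       # rewind to the beginning
content_after_seek = file.read()   # the full text is available again

print(repr(content_after_loop))
print(repr(content_after_seek))
```

Calling file.seek(0) before file.read(), or reading the content once and working from that string, avoids the empty result.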
Thanks to 4Fingers, I have changed the code to:
import re

regex_time_xml_div_1 = r"^"
regex_time_xml_div_2 = r"\s<"
xml_time_1 = '<time>'
xml_time_2 = '</time><'
# Step 1: put <time> at the start of every line
with open(r'output\output.txt',"r+") as file:
    for line in file:
        xml_time = re.sub(regex_time_xml_div_1, xml_time_1, line)
        print(xml_time,file=open(r'output\beg_line.txt',"a"))
# Step 2: replace the whitespace before '<' with </time><
with open(r'output\new.txt',"r+") as file:
    for line in file:
        xml_time = re.sub(regex_time_xml_div_2, xml_time_2, line)
        print(xml_time,file=open(r'output\after_time.txt',"a"))
That's it; the output now looks as expected. The number of extra files looked a bit messy, though, so I deleted them with os.remove.
So basically the answer is:
regex_time_xml_div_1 = r"^"
regex_time_xml_div_2 = r"\s<"
xml_time_1 = '<time>'
xml_time_2 = '</time><'
with open(r'output\output.txt',"r+") as file:
    for line in file:
        xml_time = re.sub(regex_time_xml_div_1, xml_time_1, line)
        print(xml_time,file=open(r'output\beg_line.txt',"a"))
with open(r'output\new.txt',"r+") as file:
    for line in file:
        xml_time = re.sub(regex_time_xml_div_2, xml_time_2, line)
        print(xml_time,file=open(r'output\after_time.txt',"a"))
Thank you very much for your help :)
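The two-step approach above could probably be collapsed into a single pass that needs no intermediate files. The sketch below is not the poster's code: the names PREFIX_BEFORE_XML, wrap_prefix_in_time and convert_file are hypothetical, UTF-8 encoding is assumed, and it assumes the text before the XML part never contains a '<' character.

```python
import re

# Capture everything before the first '<', excluding the whitespace in front of it.
# Assumption: the log prefix itself never contains '<'.
PREFIX_BEFORE_XML = re.compile(r"^([^<]*?)\s+<")


def wrap_prefix_in_time(line: str) -> str:
    """Wrap the leading log prefix of one line in <time>...</time>."""
    return PREFIX_BEFORE_XML.sub(r"<time>\1</time><", line, count=1)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path line by line and write the converted lines to dst_path."""
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Each line keeps its own trailing newline, so write() adds no blank lines.
            dst.write(wrap_prefix_in_time(line))


sample_line = (
    "2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT "
    "<xml code following></xml code following>\n"
)
print(wrap_prefix_in_time(sample_line), end="")
```

With the file names from the question, a call might look like convert_file(r'output\output.txt', r'output\output_new.txt'). Opening the destination once in "w" mode also avoids opening a new file handle on every print and removes the need for os.remove afterwards.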