Regex string substitution in file

I need to replace strings in a file, but so far without success. Any suggestions are welcome.

I have a file output.txt with the following content:

2021-07-28 10:27:49,869 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following> 
2021-07-28 10:27:49,881 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following> 
2021-07-28 10:27:51,834 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following> 
2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml code following></xml code following> 

I have code that is supposed to take the first part of each line:

2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT

and turn it into:

<time>2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT</time>

The code is as follows:

            regex_time_xml_div = r"\d+-\d+-\d+ \d+:\d+:\d+,\d+\s[0-z]{7}\s[0-z]{9}\s.{34}"
            with open(r'output\output.txt',"r+") as file:
                list_of_timestamps = []
                for line in file:
                    if re.search(regex_time_xml_div, str(line)):
                        list_of_timestamps.append(line)
                content = file.read()
                for i in list_of_timestamps:
                    result = re.sub(regex_time_xml_div,'<time>'+i+'</time>',content)
                    print(result,file=open(r'output\output_new.txt',"a"))

But the resulting file output_new.txt contains only 4 empty lines. Can anyone help with this? Thanks in advance.
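A likely explanation for the empty lines, as far as I can tell from the code: the `for line in file` loop reads the file to the end, so the later `file.read()` returns an empty string. `re.sub` on an empty string returns an empty string, and each `print` then writes only a newline. The sketch below reproduces this with an in-memory file; `io.StringIO` and the shortened sample lines are stand-ins I chose for illustration, not part of the original code.

```python
import io
import re

# Stand-in for output.txt (hypothetical in-memory file with two sample lines).
sample = (
    "2021-07-28 10:27:49,869 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml/>\n"
    "2021-07-28 10:27:49,881 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT <xml/>\n"
)
file = io.StringIO(sample)

lines = [line for line in file]   # iterating consumes the whole file
content = file.read()             # nothing is left to read at this point

# With an empty input string there is nothing to match, so the result is "".
result = re.sub(r"\d+-\d+-\d+", "<time>", content)
print(len(lines), repr(content), repr(result))
```

Reading the content before the loop, or calling `file.seek(0)` before `file.read()`, would avoid the empty string. The replacement would still insert the whole line `i` rather than the matched part, so that needs a separate fix.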


Thanks to 4Fingers, I have changed the code to:

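            import re

            regex_time_xml_div_1 = r"^"      # start of the line
            regex_time_xml_div_2 = r"\s<"    # whitespace right before an XML tag
            xml_time_1 = '<time>'
            xml_time_2 = '</time><'
            # Pass 1: put <time> at the beginning of every line.
            with open(r'output\output.txt', "r") as file, open(r'output\beg_line.txt', "a") as out:
                for line in file:
                    xml_time = re.sub(regex_time_xml_div_1, xml_time_1, line)
                    # line already ends with a newline, so do not add another one
                    print(xml_time, end="", file=out)

            # Pass 2: read the file produced by pass 1 and close the tag before the XML part.
            with open(r'output\beg_line.txt', "r") as file, open(r'output\after_time.txt', "a") as out:
                for line in file:
                    xml_time = re.sub(regex_time_xml_div_2, xml_time_2, line)
                    print(xml_time, end="", file=out)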

That's it, the output now looks as expected. The number of extra files looked a bit messy, though, so I deleted them with os.remove.
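For anyone who wants to avoid the intermediate files altogether, here is a minimal sketch that applies both substitutions to each line in memory and writes the result once. The names `wrap_time` and `convert` are hypothetical helpers, and I am assuming the XML part always starts at the first whitespace followed by `<`, as in the sample lines.

```python
import re

def wrap_time(line: str) -> str:
    """Wrap everything before the first whitespace + '<' in <time>...</time>."""
    # Same two patterns as above; count=1 limits each to its first match.
    line = re.sub(r"^", "<time>", line, count=1)
    return re.sub(r"\s<", "</time><", line, count=1)

def convert(src_path: str, dst_path: str) -> None:
    """Read src_path line by line and write the converted lines to dst_path."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(wrap_time(line))  # line keeps its own trailing newline

sample = (
    "2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT "
    "<xml code following></xml code following>\n"
)
print(wrap_time(sample), end="")
# <time>2021-07-28 10:27:52,182 qwer123 instanceA 10.10.10.1 aaaaa/111 ABC DEFAULT</time><xml code following></xml code following>
```

Calling `convert(r'output\output.txt', r'output\output_new.txt')` would then produce the final file directly, with nothing to clean up. Note that a line with no XML part gets the opening `<time>` but no closing tag, so such lines would need extra handling.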

So basically the answer is the two re.sub passes shown earlier (first r"^", then r"\s<"), with the intermediate files removed afterwards.

Thank you very much for your help :)