Python handling newline and tab characters when writing to file

Question

I am writing some text (which includes \n and \t characters) taken from one source file into a (text) file; for example:

Source file (test.cpp):

/*
 * test.cpp
 *
 *    2013.02.30
 *
 */

It is taken from the source file and stored in a string variable, like so:

test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"

When I write it to the file using

    with open('test.cpp', 'a') as out:
        print(test_str, file=out)

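the text is written with the newline and tab characters converted into actual new lines and tab spaces (exactly as test.cpp has them), whereas I want them to remain \n and \t, exactly as the test_str variable holds them in the first place.

Is there a way in Python to write these 'special characters' to a file without them being translated?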

Answer 1

Use replace(). Since you need to call it more than once, you might want to look at this.

test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"
with open("somefile", "w") as f:
    # Replace each real newline/tab with a literal backslash + letter.
    test_str = test_str.replace('\n', '\\n')
    test_str = test_str.replace('\t', '\\t')
    f.write(test_str)
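
The repeated replace() calls could also be collapsed into a single pass with str.translate. This is only a minimal sketch, assuming that newline, tab and backslash are the only characters that need escaping; the names ESCAPES and escape_specials are made up for illustration and do not come from the answer.

```python
# Hypothetical translation table: each special character maps to its
# two-character escape sequence. Only backslash, newline and tab are covered.
ESCAPES = str.maketrans({
    "\\": "\\\\",  # escape backslashes too, so the output stays unambiguous
    "\n": "\\n",
    "\t": "\\t",
})


def escape_specials(text):
    """Return text with newline, tab and backslash written as escapes."""
    return text.translate(ESCAPES)


test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"
escaped = escape_specials(test_str)
print(escaped)  # a single line containing literal \n and \t sequences
```

The escaped string can then be passed to f.write() exactly as in the snippet above.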

Answer 2

You can use str.encode:

with open('test.cpp', 'a') as out:
    print(test_str.encode('unicode_escape').decode('utf-8'), file=out)

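This escapes every special character that Python writes as an escape sequence (newlines and tabs, but also backslashes and non-ASCII characters).

With your example: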

>>> test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"
>>> test_str.encode('unicode_escape')
b'/*\\n test.cpp\\n *\\n *\\n *\\n\\t2013.02.30\\n *\\n */\\n'
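
If the escaped text ever has to be turned back into the original string, the same codec can be applied in reverse. A small sketch, assuming the escaped text was produced as in the snippet above; the variable names escaped and restored are only illustrative.

```python
test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"

# Escape: real newline/tab characters become backslash sequences.
escaped = test_str.encode('unicode_escape').decode('ascii')

# Unescape: interpret the backslash sequences again.
restored = escaped.encode('ascii').decode('unicode_escape')

print(escaped)
print(restored == test_str)
```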

Answer 3

I want them to remain \n and \t exactly like the test_str variable holds them in the first place.

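test_str does NOT contain a backslash \ followed by t (two characters). It contains a single character, ord('\t') == 9 (the same character as in test.cpp). The backslash is special in Python string literals; for example, u'\U0001f600' is NOT ten characters, it is a single character (U+1F600). Don't confuse a string object in memory at runtime with its textual representation as a string literal in Python source code.

JSON could be a better alternative than the unicode-escape encoding for storing text (it is more portable), i.e., use: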
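
The difference between a string object and its literal can be checked directly in the interpreter. A quick illustration; the variable names tab and raw are chosen here only for the example.

```python
tab = "\t"   # ordinary literal: the escape produces ONE character
raw = r"\t"  # raw literal: a backslash followed by the letter t

print(len(tab), ord(tab))  # 1 9
print(len(raw))            # 2
print(repr(tab))           # '\t'  (repr shows the literal form, not the object)
print(len("\U0001f600"))   # 1
```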

import json

with open('test.json', 'w') as file:
    json.dump({'test.cpp': test_str}, file)

instead of test_str.encode('unicode_escape').decode('ascii').

To read the JSON back:

with open('test.json') as file:
    test_str = json.load(file)['test.cpp']
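
To see what actually ends up in the JSON file, the same round trip can be done in memory with json.dumps and json.loads. This is a sketch only; the variable names encoded and decoded are illustrative.

```python
import json

test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"

# json.dumps produces the same text that json.dump would write to the file.
encoded = json.dumps({'test.cpp': test_str})
print(encoded)  # newlines and tabs appear as literal \n and \t sequences

# Decoding restores the original string, real newlines and tabs included.
decoded = json.loads(encoded)['test.cpp']
print(decoded == test_str)
```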

Tags: python, newline, python-3.x