How to convert a string containing a bytes value back to bytes?

Question

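I have a program in which I write the output of Python's check_output to a file. I forgot to set the encoding to "utf-8", so all of the output was bytes, and I wrote the string form of those bytes values to the file. My file now contains strings such as "b' math \xf0\x9d", which mix ASCII text and hex escapes. How can I keep the ASCII text and convert hex escapes such as \xf0\x9d back to their original characters?

To solve this, I need a way to convert a string that contains a bytes value back to bytes. In the example below, opt is bytes and temp is a string. How can I convert temp back to opt?

More details: this is the code I originally ran. What I get in the variable opt contains hex escapes. I hoped to get rid of them by converting it to a string, but that did not work.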

from subprocess import check_output

latex = "a+b"
opt = check_output(["latexmlmath", "--quiet", "--cmml=-", latex])
temp = str(opt)
# also tried
temp = str(opt).encode("utf-8")

The values of opt and temp are:

b'<?xml version="1.0" encoding="UTF-8"?>\n<math xmlns="http://www.w3.org/1998/Math/MathML" alttext="a+b" display="block">\n  <apply>\n    <plus/>\n    <ci>\xf0\x9d\x91\x8e</ci>\n    <ci>\xf0\x9d\x91\x8f</ci>\n  </apply>\n</math>\n'
b'<?xml version="1.0" encoding="UTF-8"?>\n<math xmlns="http://www.w3.org/1998/Math/MathML" alttext="a+b" display="block">\n  <apply>\n    <plus/>\n    <ci>\xf0\x9d\x91\x8e</ci>\n    <ci>\xf0\x9d\x91\x8f</ci>\n  </apply>\n</math>\n'

Answer 1

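You want opt.decode('utf-8'). Calling str on a bytes object without the second (encoding) argument just gives you the repr of the bytes object. If you already have data produced by such a conversion, you can turn it back into the original bytes object with ast.literal_eval and then perform the intended decode on the result. Example: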

import ast

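# The backslashes are doubled so the string holds the literal text of the repr, as it would appear in the file
baddata = 'b\'<?xml version="1.0" encoding="UTF-8"?>\\n<math xmlns="http://www.w3.org/1998/Math/MathML" alttext="a+b" display="block">\\n  <apply>\\n    <plus/>\\n    <ci>\\xf0\\x9d\\x91\\x8e</ci>\\n    <ci>\\xf0\\x9d\\x91\\x8f</ci>\\n  </apply>\\n</math>\\n\''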
gooddata = ast.literal_eval(baddata).decode('utf-8')
print(gooddata)

Output:

<?xml version="1.0" encoding="UTF-8"?>
<math xmlns="http://www.w3.org/1998/Math/MathML" alttext="a+b" display="block">
  <apply>
    <plus/>
    <ci>𝑎</ci>
    <ci>𝑏</ci>
  </apply>
</math>
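
Since the question is about text that has already been written to a file, the same idea can be wrapped in a small helper. This is only a sketch: the function name recover_text is made up for illustration, and it assumes each stored value is a complete bytes repr (starting with b' and ending with the closing quote), which a truncated fragment such as "b' math \xf0\x9d" would not be.

```python
import ast


def recover_text(repr_text, encoding="utf-8"):
    """Convert the text form "b'...'" back to bytes, then decode it.

    recover_text is a hypothetical helper name, not part of any library.
    """
    # literal_eval only parses Python literals, so it is safe on untrusted text
    value = ast.literal_eval(repr_text.strip())
    if not isinstance(value, bytes):
        raise ValueError("expected the repr of a bytes object")
    return value.decode(encoding)


# Simulate the mistake: the repr of the bytes is what ended up in the file
original = "<ci>\U0001d44e</ci>\n".encode("utf-8")
written = repr(original)

recovered = recover_text(written)
print(recovered == original.decode("utf-8"))
```

If the file holds one such repr per line, calling the helper on each line should be enough. For new runs, decoding the check_output result with opt.decode('utf-8') before writing avoids the problem entirely.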

python

ascii