Why does Python 3.8 interpret a string like "\x48" as "H"? I want to keep it as "\x48" in the string

Question

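I am writing a simple Python program that reads a sample C++ ".cpp" file as a string and finds every string literal declared in it. I then want to replace each literal with its hexadecimal escape equivalent, for example "H" becomes "\x48".

My code is: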

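import re

# Read the whole C++ source file into one string
with open("sample.cpp", "r") as f:
    f1 = f.read()

# Match double-quoted string literals, allowing escaped quotes inside
regex = r"\"(?:(?:(?!(?<!\\)\").)*)\""
find = re.findall(regex, f1)

# Print the hex-escaped form of every literal that was found
for literal in find:
    print("Output Of Encode=")
    str2021 = "".join(r'\x{0:x}'.format(ord(c)) for c in literal)
    print(str2021)

# Not a raw string: Python turns this into the three characters "H"
subst = '\x22\x48\x22'
result = re.sub(regex, subst, f1, 0)
if result:
    print("substituted op=")
    print(result)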

Now, when I print result, it shows "H" instead of "\x22\x48\x22". How can I force Python 3.8 to keep the escape sequences as literal text?

If I instead call result = re.sub(regex, str2021, f1, 0), it raises an error:

raise s.error('bad escape %s' % this, len(this))
re.error: bad escape \x at position 0

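I want to iterate over the matches so that, for every string literal the regex finds in the cpp file, the code automatically converts it to the equivalent hex escape codes, as shown below.

Sample.cpp:

string a="abc";
string b="H";

The cpp file should be changed to:

string a="\x61\x62\x63";
string b="\x48";

Please suggest a solution.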

Answer 1

You can use a raw string, so that Python does not process the \x escapes when it parses the literal:

subst = r'\x22\x48\x22'
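As a quick illustration of the difference, here is a minimal sketch comparing a regular string literal with a raw one. The variable names plain and raw are only for this example.

```python
# In a regular literal, Python converts each \xNN escape while parsing the source.
plain = '\x22\x48\x22'
# In a raw literal, the backslashes stay in the string as ordinary characters.
raw = r'\x22\x48\x22'

print(plain)       # "H"
print(len(plain))  # 3
print(raw)         # \x22\x48\x22
print(len(raw))    # 12
```

So the "H" in the question is produced by the Python parser before re.sub ever sees the value.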

Then change your re.sub call as well, because re.sub also processes backslash escapes in the replacement string:

re.sub(regex, re.escape(subst), f1, 0)
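To see why the raw string alone is not enough, the sketch below reproduces the "bad escape" error on a small input and then applies the escaped replacement. The input string source and the simplified pattern r'"H"' are assumptions made for this example, not the regex from the question.

```python
import re

source = 'string b="H";'       # hypothetical one-line input
subst = r'\x22\x48\x22'

# re.sub treats backslashes in the replacement as template escapes,
# and \x is not a valid one, so this call raises re.error.
try:
    re.sub(r'"H"', subst, source)
    error_message = None
except re.error as exc:
    error_message = str(exc)
print(error_message)

# Doubling the backslashes makes re.sub emit them literally.
fixed = re.sub(r'"H"', re.escape(subst), source)
print(fixed)  # string b=\x22\x48\x22;
```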

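However, the Python documentation says that re.escape should not be used for the replacement string and that only the backslashes should be escaped:

re.sub(regex, subst.replace("\\", r"\\"), f1, 0)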
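For the full task in the question (encoding every literal, not just one fixed replacement), one possible approach is to pass a function to re.sub instead of a string. The return value of a replacement function is inserted as-is, so no backslash escaping is needed. This is a sketch under assumptions, not the code from the question: STRING_LITERAL, to_hex_literal and encode_strings are hypothetical names, the pattern is a simplified one, and the literals are assumed to contain only plain ASCII characters with no C++ escape sequences such as \n inside them.

```python
import re

# Simplified pattern for a double-quoted literal; group 1 is the text between the quotes.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')

def to_hex_literal(match):
    """Return the matched literal with every character written as \\xNN."""
    body = match.group(1)
    encoded = "".join("\\x{:02x}".format(ord(ch)) for ch in body)
    return '"' + encoded + '"'

def encode_strings(cpp_source):
    # A callable replacement is used verbatim, so the backslashes survive.
    return STRING_LITERAL.sub(to_hex_literal, cpp_source)

sample = 'string a="abc"; string b="H";'
print(encode_strings(sample))
# string a="\x61\x62\x63"; string b="\x48";
```

If the literals may already contain escape sequences, those would need to be decoded first, since this sketch encodes the backslash and the following character separately.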


python

string

unicode-escapes

python-unicode