python3: Unescape unicode escapes surrounded by unescaped characters

Question

I receive JSON data in which some Unicode characters are escaped and others are not.

>>> example = r'сло\u0301во'

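What is the best way to unescape these characters? In the example below, what would the function unescape look like? Is there a built-in function that does this?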

>>> unescape(example)
сло́во

Answer 1

This solution assumes that every instance of \u in the original string is a Unicode escape:

def unescape(in_str):
    """Unicode-unescape string with only some characters escaped."""
    # bytes with all chars escaped (the original escapes have the backslash escaped)
    in_str = in_str.encode('unicode-escape')
    # unescape the \ : turn the two-backslash form (\\u) back into a single \u
    in_str = in_str.replace(b'\\\\u', b'\\u')
    # unescape unicode
    in_str = in_str.decode('unicode-escape')
    return in_str

...or in one line...

def unescape(in_str):
    """Unicode-unescape string with only some characters escaped."""
    return in_str.encode('unicode-escape').replace(b'\\\\u', b'\\u').decode('unicode-escape')
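If the assumption above does not hold (for example, the text may contain a \u that is not followed by four hex digits), a regex-based variant is one possible alternative. This is only a sketch, not part of the original answer: the name unescape_regex and the pattern are illustrative, and it decodes each \uXXXX on its own, so an escaped surrogate pair would come out as two lone surrogates rather than one character.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_ESCAPE_RE = re.compile(r'\\u([0-9a-fA-F]{4})')

def unescape_regex(in_str):
    """Replace only well-formed \\uXXXX sequences; leave all other text untouched."""
    # Hypothetical helper: convert the four hex digits to the code point they name.
    return _ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), in_str)

example = r'сло\u0301во'
print(unescape_regex(example))                     # сло́во
print(unescape_regex(r'C:\users') == r'C:\users')  # True: not a valid escape, so unchanged
```

Unlike the encode/decode round trip, this version never touches characters outside the matched escapes, so a stray \u passes through unchanged instead of raising a decode error.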
