Unicode error while parsing table in python markdown
I am using Python Markdown to parse the following table.
Escape sequences | Character represented
-----------------|--------------------------
\b | Backspace
\t | Tab
\f | Form feed
\n | New line
\r | Carriage return
\ | Backslash
\' | Single quote
\" | Double quote
\uNNNN | where NNNN is a unicode number, with this escape sequence you can print unicode characters
This is the code I am using:
import markdown
# str holds the table text shown above
html = markdown.markdown(str, extensions=['markdown.extensions.tables', 'markdown.extensions.fenced_code',
                                          'markdown.extensions.toc', 'markdown.extensions.wikilinks'])
print(html)
Here is the error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 1000-1001: truncated \uXXXX escape
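The problem here is that your input string contains backslashes, which have a special meaning in Python string literals. To make it work, your input data should look like this: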
Escape sequences | Character represented
-----------------|--------------------------
\b | Backspace
\t | Tab
\f | Form feed
\n | New line
\r | Carriage return
\\ | Backslash
\' | Single quote
\" | Double quote
\uNNNN | where NNNN is a unicode number, with this escape sequence you can print unicode characters
That is, the backslashes themselves should be escaped.
A crude way to achieve this is probably just to do some preprocessing before parsing with markdown:
str = str.replace('\\', '\\\\')  # yes, here too :)
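If the table is written directly in the .py file as a string literal (an assumption, since the question does not show where str comes from), a raw string may be a simpler way to keep the backslashes intact. The sketch below uses only the standard library; the names table and bad_source are made up for illustration, and the table is shortened to three rows.

```python
# Assumption: the table text is embedded in the source file as a literal.
# The r prefix makes this a raw string, so every backslash stays as typed.
table = r"""Escape sequences | Character represented
-----------------|--------------------------
\n | New line
\\ | Backslash
\uNNNN | where NNNN is a unicode number
"""

# The same kind of text in a normal (non-raw) literal is rejected while
# Python compiles the source, before markdown is ever called.
bad_source = 'text = "\\uNNNN | where NNNN is a unicode number"'
try:
    compile(bad_source, "<example>", "exec")
    compiled = True
except SyntaxError as exc:
    compiled = False
    error_message = str(exc)

print(compiled)
print(error_message)
```

The raw string can then be passed to markdown.markdown() in place of str. If the text is read from a file instead of being typed into the source, this escaping issue should not come up at all, since escape sequences are only interpreted in string literals.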