Unicode bug flask jinja2

Question

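I went back to Python to build a web page with Flask. Everything works well and I highly recommend Flask. But when it comes to Unicode and encodings, things always get difficult between Python, the web page, and so on.

So I have a form that I POST to a specific Flask route. I get my values, and I need to do a little wrapping to get my variables in order.

I got this dict: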

            task_formatted.append(str(item['entity']))

I transform it to a str, then append it to a list so I can easily pass it to my template.

I would like the str to be rendered as UTF-8 on the web page. Python page:

  # -*- coding: utf-8 -*-

HTML page:

  <meta charset="utf-8"/>

Then I print them in my page using Jinja:

            {% for item in task %}
            <tr>
              <td>{{item[0].decode('utf-8')}}</td>
              <td>{{item[1].decode('utf-8')}}</td>
              <td>{{item[2]}}</td>
              <td>{{item[3]}}</td>
              <td>{{item[4]}}</td>
              <td><button id="taskmodal1"></td>
            </tr>
            {% endfor %}

But my item[0].decode('utf-8') and my item[1].decode('utf-8') are printing:

{'type': 'Asset', 'id': 1404, 'name': 'Test-Asset comm\xc3\xa9'}

instead of

{'type': 'Asset', 'id': 1404, 'name': 'Test-Asset commé'}

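On the Python side I tried several approaches: .encode('utf-8'), unicode(str), and render_template().encode('utf-8'). I'm running out of ideas.

To be fair, I think there is something I don't understand about Unicode, so I would like some explanation (not doc links, because I have most likely read them already) or a solution to get this working.

It is very important that my program writes the str correctly, because I use it afterwards in a JS HTTP call.

Thanks

PS: I'm using Python 2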

Answer 1

You are doing it wrong.

<td>{{item[0].decode('utf-8')}}</td>

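Why are you adding decode? This is wrong. I recommend that you don't put in any conversion function at all. UTF-8 will work fine (I think it is the default). In any case, be clear about the direction of each conversion: "encoding" goes from text to bytes in a specific encoding such as UTF-8, and "decoding" goes from bytes in a specific encoding back to semantic values (text). In fact, in Python you should not care how strings are encoded internally (incidentally, CPython 3.3+ stores each string as Latin-1, UCS-2, or UCS-4, whichever is the most compact for the whole string).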

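Just remove decode('utf-8'). Within your Python code you should not care about encoding and decoding, only at input and output: use the sandwich rule. This greatly simplifies string handling and logic, and avoids most bugs.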
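To make the sandwich rule concrete, here is a minimal sketch. It is written in Python 3 syntax, where `bytes` plays the role of Python 2's `str` and `str` plays the role of Python 2's `unicode`. The `handle_request` function is a hypothetical stand-in, not a Flask API; as far as I know, Flask and Jinja2 normally take care of both edges for you, which is why no explicit conversion should be needed in the template.

```python
# Sandwich rule sketch: bytes at the edges, text in the middle.
# Python 3 syntax: `bytes` ~ Python 2 `str`, `str` ~ Python 2 `unicode`.


def handle_request(raw_name):
    """Hypothetical handler: takes UTF-8 bytes in, returns UTF-8 bytes out."""
    # 1. Input edge: decode once, as early as possible.
    name = raw_name.decode("utf-8")

    # 2. Middle: work only with text; no encode/decode calls here.
    row = u"<td>{}</td>".format(name.strip())

    # 3. Output edge: encode once, as late as possible.
    return row.encode("utf-8")


response = handle_request(b"  Test-Asset comm\xc3\xa9 ")
print(repr(response))
```

The only encode and decode calls sit at the two boundaries; everything in between sees text and never a byte string.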

Answer 2

I found a solution to my problem:

unicodedata.normalize('NFKD', unicode(str(item['entity']['type']) + str(item['entity']['name']),'utf-8'))

First I convert the values from my dict to strings with str(), then I convert the result to Unicode with unicode('str', 'utf-8'), and finally, after importing unicodedata, I use unicodedata.normalize().

Hope this helps everyone.
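One thing worth knowing about the normalize step: NFKD does not just "clean up" the text, it changes the code points. A small sketch of that effect (Python 3 syntax, using a sample value taken from the question):

```python
import unicodedata

composed = u"comm\u00e9"  # 'é' as the single code point U+00E9
decomposed = unicodedata.normalize("NFKD", composed)

# NFKD splits 'é' into 'e' followed by U+0301 COMBINING ACUTE ACCENT,
# so the string gets one code point longer.
print(len(composed), len(decomposed))
print(decomposed == u"comme\u0301")

# NFC recomposes it back into the single code point.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

The two forms look identical when rendered, but they compare unequal, which may matter if the value is later matched against something else, for example in the JS HTTP call.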

Answer 3

I got this dict:
task_formatted.append(str(item['entity']))
I transform it to a str, then append it to a list so I can easily pass it to my template

This code doesn't do what you think it does.

>>> entity = {'type': 'Asset', 'id': 1404, 'name': 'Test-Asset commé'}
>>> str(entity)
"{'type': 'Asset', 'id': 1404, 'name': 'Test-Asset comm\xc3\xa9'}"

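When you call str on a dict (or a list), you don't get the result of calling str on each of its keys and values: you get the repr of each key and value. In this case, that means 'Test-Asset commé' has been converted to 'Test-Asset comm\xc3\xa9' in a way that is hard to reverse.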

>>> str(entity).decode('utf-8')  # <- this doesn't work.
u"{'type': 'Asset', 'id': 1404, 'name': 'Test-Asset comm\xc3\xa9'}"
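The same effect can be reproduced on Python 3, which may make it easier to see what is going on. This is only an analogy: the `bytes` value below stands in for a Python 2 `str`, and the dict is cut down to one key.

```python
# Python 3 syntax; the bytes value stands in for a Python 2 str.
entity = {"name": b"comm\xc3\xa9"}

as_text = str(entity)
print(as_text)

# The text now holds a literal backslash followed by "xc3", i.e. the
# repr of the bytes, not the two bytes 0xC3 0xA9 themselves.
print("\\xc3\\xa9" in as_text)

# The value on its own still decodes cleanly.
print(entity["name"].decode("utf-8") == u"comm\u00e9")
```

Once the escape sequence has been written into the string as plain characters, a later decode('utf-8') has nothing left to decode, which is why the conversion has to happen on the individual values and not on the stringified dict.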

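If you just want to render your dicts in the template using {{ item }}, you can serialize them with the json module instead of str. Note that you need to convert the resulting JSON (which is of type str) to a unicode instance to avoid a UnicodeDecodeError when the template is rendered.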

>>> import json
>>> import jinja2
>>> template = jinja2.Template(u"""<td>{{item}}</td>""")
>>> j = json.dumps(entity, ensure_ascii=False)
>>> uj = unicode(j, 'utf-8')
>>> print template.render(item=uj)
<td>{"type": "Asset", "id": 1404, "name": "Test-Asset commé"}</td>
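For comparison, a short sketch of why json is the safer serializer here: unlike str(), its output can be parsed back into the original dict, with or without ensure_ascii. This is Python 3 syntax and uses only the standard library; the entity value is the one from the question.

```python
import json

entity = {"type": "Asset", "id": 1404, "name": u"Test-Asset comm\u00e9"}

escaped = json.dumps(entity)                       # default: ensure_ascii=True
readable = json.dumps(entity, ensure_ascii=False)  # keeps 'é' as-is

# With the default, 'é' is written as the JSON escape \u00e9.
print(escaped)

# Both forms round-trip back to the same dict.
print(json.loads(escaped) == entity)
print(json.loads(readable) == entity)
```

Either form is valid JSON that JavaScript can parse directly, so the choice mostly comes down to whether you want the raw output to be human-readable.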

Some general observations / summary:

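Don't use str (or unicode) to serialize containers such as dicts or lists; use a tool like json or pickle.
Make sure that any string literals you pass to jinja2 are instances of unicode, not str.
When using Python 2, if there is any chance that your code will handle non-ASCII values, always use unicode, never str.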
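Applied to the list built in the question, those points could look roughly like this. It is a sketch only: `to_text` is a hypothetical helper, and the shape of `entity` is assumed from the output shown in the question. `bytes` is an alias for `str` on Python 2, so the helper is meant to behave the same way there.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return text, decoding byte strings if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Byte-string values, as they might arrive in Python 2.
entity = {"type": b"Asset", "id": 1404, "name": b"Test-Asset comm\xc3\xa9"}

# Pass individual text values to the template, not str(entity).
row = [to_text(entity["type"]), to_text(entity["name"]), entity["id"]]
print(repr(row))
```

With a row like this, the template can use {{ item[0] }} and {{ item[1] }} directly, with no decode call.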