UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 1: ordinal not in range(128)
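So, as the title says, I have a problem with encoding/decoding strings.
I am using:
Python 2.7 | Django 1.11 | Jinja2 2.8
Basically, I retrieve some data from the database, serialize it, store it in the cache, then read the cache back, deserialize it and render it in a template.
The problem:
I have people whose first and last names contain characters like "ă".
I serialize with json.dumps.
An example of the data being serialized looks like this (I have 10 of these):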
active_agents = User.region_objects.get_active_agents()
agents_by_commission_last_month = active_agents.values(....
    "first_name", "last_name").order_by(
    '-total_paid_transaction_value_last_month')
Then, when I set the cache, I do this:
for key, value in context.items():
    ......
    value = json.dumps(list(value), default=str, ensure_ascii=False).encode('utf-8')
where value is the list of dicts returned by .values() in the code above, and key is region_agents_by_commission_last_month (like the variable in the previous code).
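For reference, the round trip described above can be sketched in isolation. This is only an illustration under assumptions: `rows` is a hypothetical stand-in for `list(queryset.values(...))`, and no real cache is involved. It suggests that plain `json.loads` on the decoded payload already gives back text values, with no extra hook.

```python
import json

# Hypothetical stand-in for list(active_agents.values("first_name", "last_name")).
rows = [{u"first_name": u"C\u0103t\u0103lin", u"last_name": u"Pintea"}]

# Serialize: text -> JSON text -> UTF-8 bytes, as in the cache-writing loop.
payload = json.dumps(rows, default=str, ensure_ascii=False).encode("utf-8")

# Deserialize: UTF-8 bytes -> JSON text -> dicts whose values are text again.
restored = json.loads(payload.decode("utf-8"))

# Concatenating text with text needs no further encode/decode step.
full_name = restored[0]["first_name"] + u" " + restored[0]["last_name"]
```

Keeping the values as text after loading means the only place bytes appear is at the cache boundary.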
Now I have to read the cache back, so I do the same process in reverse:
serialized_keys = ['agencies_by_commission_last_month',
                   'region_agents_by_commission_last_month',
                   'region_agents_by_commission_last_12_months',
                   'region_agents_by_commission_last_30_days',
                   'agencies_by_commission_last_year',
                   'agencies_by_commission_last_12_months',
                   'agencies_by_commission_last_30_days',
                   'region_agents_by_commission_last_year',
                   'agency',
                   'for_agent']
context = {}
for key, value in region_ranking_cache.items():
    if key in serialized_keys:
        objects = json.loads(value, object_hook=_decode_dict)
        for serilized_dict in objects:
            ....
            d['full_name'] = d['first_name'] + " " + d['last_name']
            full_name = d['full_name'].decode('utf-8').encode('utf-8')
            d['full_name'] = full_name
            print(d['full_name'])
            ....
The print gives: Cătălin Pintea, which is fine.
But in the dict that I render I get: 'full_name': 'C\xc4\x83t\xc4\x83lin Pintea',
The _decode_dict passed as object_hook looks like this:
def _decode_list(data):
    rv = []
    for item in data:
        if isinstance(item, unicode):
            item = item.encode('utf-8')
        elif isinstance(item, list):
            item = _decode_list(item)
        elif isinstance(item, dict):
            item = _decode_dict(item)
        rv.append(item)
    return rv

def _decode_dict(data):
    rv = {}
    for key, value in data.items():
        if isinstance(key, unicode):
            key = key.encode('utf-8')
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        elif isinstance(value, list):
            value = _decode_list(value)
        elif isinstance(value, dict):
            value = _decode_dict(value)
        rv[key] = value
    return rv
Basically, when calling json.loads, I use this object hook function to encode() every key and value to utf-8.
This is how I avoided this error being thrown in views.py.
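To make the effect of such a hook concrete, here is a small sketch. `encode_values` is a simplified, hypothetical version of the hook (it handles only flat dicts and only values) and `raw` is made-up sample data. It suggests that `json.loads` returns text by default, and that it is the hook that turns the values into UTF-8 byte strings like the 'C\xc4\x83t\xc4\x83lin Pintea' seen in the rendered dict.

```python
import json

text_type = type(u"")  # unicode on Python 2, str on Python 3

def encode_values(data):
    # Hypothetical, simplified hook: encode text values to UTF-8 bytes.
    return {key: value.encode("utf-8") if isinstance(value, text_type) else value
            for key, value in data.items()}

# Made-up JSON payload; \u0103 is the JSON escape for the character "ă".
raw = u'{"first_name": "C\\u0103t\\u0103lin"}'

plain = json.loads(raw)                              # values stay text
hooked = json.loads(raw, object_hook=encode_values)  # values become bytes
```

Without the hook the value is already the kind of string a unicode-based template engine expects.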
The error
Somewhere in the template, I am using:
<td>{{ agent.full_name }}</td>
and agent.full_name comes from: 'full_name': 'C\xc4\x83t\xc4\x83lin Pintea',
Traceback
Traceback:
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/exception.py" in inner
41. response = get_response(request)
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _legacy_get_response
249. response = self._get_response(request)
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _get_response
187. response = self.process_exception_by_middleware(e, request)
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _get_response
185. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/usr/local/lib/python2.7/dist-packages/django/utils/decorators.py" in inner
185. return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/django/contrib/auth/decorators.py" in _wrapped_view
23. return view_func(request, *args, **kwargs)
File "/app/crmrebs/utils/__init__.py" in wrapper
255. return http_response_class(t.render(output, request))
File "/usr/local/lib/python2.7/dist-packages/django_jinja/backend.py" in render
106. return mark_safe(self.template.render(context))
File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py" in render
989. return self.environment.handle_exception(exc_info, True)
File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py" in handle_exception
754. reraise(exc_type, exc_value, tb)
File "/app/crmrebs/jinja2/ranking/dashboard_ranking.html" in top-level template code
1. {% extends "base.html" %}
File "/app/crmrebs/jinja2/base.html" in top-level template code
1. {% extends "base_stripped.html" %}
File "/app/crmrebs/jinja2/base_stripped.html" in top-level template code
94. {% block content %}
File "/app/crmrebs/jinja2/ranking/dashboard_ranking.html" in block "content"
83. {% include "dashboard/region_ranking.html" %}
File "/app/crmrebs/jinja2/dashboard/region_ranking.html" in top-level template code
41. {% include "dashboard/_agent_ranking_row_month.html" %}
File "/app/crmrebs/jinja2/dashboard/_agent_ranking_row_month.html" in top-level template code
2. <td>{{ agent.full_name }}</td>
Exception Type: UnicodeDecodeError at /ranking
Exception Value: 'ascii' codec can't decode byte 0xc4 in position 1: ordinal not in range(128)
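The exception value above can be reproduced outside the template. The sketch below assumes, without the traceback confirming it, that the template engine ends up decoding the byte string with Python 2's default ascii codec; the explicit `decode("ascii")` is a stand-in for that implicit step.

```python
# UTF-8 bytes for the name, i.e. 'C\xc4\x83t\xc4\x83lin Pintea'.
name_bytes = u"C\u0103t\u0103lin Pintea".encode("utf-8")

message = None
try:
    # Stand-in for the implicit bytes -> text coercion (assumed cause).
    name_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the codec the bytes were written in works.
name_text = name_bytes.decode("utf-8")
```

Byte 0xc4 at position 1 is the first byte of the UTF-8 encoding of "ă", right after the "C".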
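This is where the error comes from. I tried other things, but I guess this is a limitation of Python 2.7. I usually use Python 3.9, but for this project I have to use 2.7.
I tried other answers here, but nothing helped.
Can anyone help me serialize these dicts properly, and how can I avoid this mess?
I hope I made myself clear.
Have a nice day, everyone!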
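So, I managed to solve my problem.
- I found that
active_agents.values(...."first_name", "last_name").order_by('-total_paid_transaction_value_last_month')
returns dicts whose keys and values are already unicode (because of how it is configured in models.py, with Django 1.11 and Python 2.7). So the serialization step was fine.
The final result that reached the template did indeed look like 'C\xc4\x83t\xc4\x83lin'. The error came from the \xc4 byte.
- To fix it in the template, I did this:
{{ agent.full_name.decode("utf-8") }}, which gave me the correct result:
Cătălin Pintea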
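As an alternative to decoding inside the template, the same conversion could be done once in the view before rendering. This is only a sketch: `to_text` and the sample `agent` dict are hypothetical and not part of the original code.

```python
def to_text(value, encoding="utf-8"):
    """Recursively decode byte strings inside dicts and lists to text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, dict):
        return {to_text(k, encoding): to_text(v, encoding)
                for k, v in value.items()}
    if isinstance(value, list):
        return [to_text(item, encoding) for item in value]
    return value  # numbers, None and text pass through unchanged

# Hypothetical entry as it would look after the byte-encoding hook.
agent = {b"full_name": b"C\xc4\x83t\xc4\x83lin Pintea", b"rank": 1}
clean = to_text(agent)
```

With the context normalized this way, the template can keep using {{ agent.full_name }} unchanged.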
Thanks @BoarGules. d['last_name'] and d['first_name'] were indeed unicode. So when I did the concatenation, I had to add u" ".
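The concatenation with u" " can be sketched as follows. `as_text` and `build_full_name` are hypothetical helpers, added so the example tolerates a stray byte string as well as text.

```python
def as_text(value, encoding="utf-8"):
    # Decode byte strings; leave text untouched.
    return value.decode(encoding) if isinstance(value, bytes) else value

def build_full_name(first_name, last_name):
    # Join with a text separator so the result is always text.
    return u" ".join([as_text(first_name), as_text(last_name)])

full_name = build_full_name(u"C\u0103t\u0103lin", u"Pintea")
```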