Pyramid/Python/SQLAlchemy encoding hell

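I'm stumped.

I'm using SQLAlchemy and Pyramid in a web application. One of the application's features is parsing input from a form, which is passed over an XML-RPC bridge to a Ruby parser.

The problem occurs when I try to return the newly parsed object as JSON using my renderer.

Here is the error, followed by the details: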

 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 93: ordinal not in range(128)

Setup

Database settings

Collation: utf8_general_ci

Model

class Citation(Base):
    __tablename__ = 'citations'
    __table_args__ = {'autoload' : True}
    authors = relationship("Author", secondary=author_of, backref='citations')

    possible_matches = relationship("Citation", secondary=similar_to,
        primaryjoin=citation_id==similar_to.c.citation_id1,
        secondaryjoin=citation_id==similar_to.c.citation_id2
        )

    def __init__(self, citation_dict=None):
        self.__dict__.update(citation_dict)


    def __repr__(self):
        return "<Citation %d: %s (%s)>" %\
               (self.citation_id, self.title, self.year)

    @property
    def json(self):
        attrs =\
            ['pubtype', 'abstract', 'keywords', 'doi', 'url', 'address',
             'booktitle', 'chapter', 'crossref', 'edition', 'editor',
             'translator', 'howpublished', 'institution', 'journal',
             'bibtex_key', 'month', 'note', 'number', 'organization',
             'pages', 'publisher', 'location', 'school', 'series', 'title',
             'type', 'volume', 'year', 'raw', 'verified', 'last_modified',
             'entryTime', 'citation_id']
        struct = { 'authors' : [a.json for a in self.authors] }
        for attr in attrs:
            struct[attr] = getattr(self, attr, None)

        struct["auth_string"] = " ".join([a.toString() for a in self.authors])
        return struct

View

@view_config(route_name='citation_add', request_method='POST', renderer='pubs_json')
def citation_add(request):
    raw = request.body
    citation = parser.parse(raw)[0]

    return citation.json

Renderer

# -*- coding: utf-8 -*-

import customjson
import os
from pyramid.asset import abspath_from_asset_spec

class PubsJSONRenderer:
        def __init__(self, info):
                """ Constructor: info will be an object having the 
                following attributes: name (the renderer name), package 
                (the package that was 'current' at the time the 
                renderer was registered), type (the renderer type 
                name), registry (the current application registry) and 
                settings (the deployment settings dictionary).        """


        def __call__(self, value, system):
                """ Call the renderer implementation with the value 
                and the system value passed in as arguments and return 
                the result (a string or unicode object).  The value is 
                the return value of a view.  The system value is a 
                dictionary containing available system values 
                (e.g. view, context, and request). """
                request = system.get('request')
                if request is not None:
                        if not hasattr(request, 'response_content_type'):
                                request.response_content_type = 'application/json'

                return customjson.dumps(value)

customjson.py

from json import JSONEncoder
from decimal import Decimal
class ExtJsonEncoder(JSONEncoder):
    '''
    Extends ``simplejson.JSONEncoder`` by allowing it to encode any
    arbitrary generator, iterator, closure or functor.
    '''
    def default(self, c):
        # Handles generators and iterators
        if hasattr(c, '__iter__'):
            return [i for i in c]

        # Handles closures and functors
        if hasattr(c, '__call__'):
            return c()

        # Handles precise decimals with loss of precision to float.
        # Hack, but it works
        if isinstance(c, Decimal):
            return float(c)

        return JSONEncoder.default(self, c)

def dumps(*args):
    '''
    Shortcut for ``ExtJsonEncoder.encode()``
    '''
    return ExtJsonEncoder(sort_keys=False, ensure_ascii=False,
            skipkeys=True).encode(*args)

Stack trace

 Traceback (most recent call last):
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/router.py", line 242, in __call__
     response = self.invoke_subrequest(request, use_tweens=True)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/router.py", line 217, in invoke_subrequest
     response = handle_request(request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid_debugtoolbar/toolbar.py", line 160, in toolbar_tween
     return handler(request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/tweens.py", line 21, in excview_tween
     response = handler(request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid_tm/__init__.py", line 82, in tm_tween
     reraise(*exc_info)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid_tm/__init__.py", line 63, in tm_tween
     response = handler(request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/router.py", line 163, in handle_request
     response = view_callable(context, request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/config/views.py", line 329, in attr_view
     return view(context, request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/config/views.py", line 305, in predicate_wrapper
     return view(context, request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/config/views.py", line 377, in rendered_view
     context)
   File "/var/site/sitvenv/lib/python2.7/site-packages/pyramid/renderers.py", line 418, in render_view
     return self.render_to_response(response, system, request=request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/renderers.py", line 441, in render_to_response
     result = self.render(value, system_values, request=request)
   File "/var/site/siteenv/lib/python2.7/site-packages/pyramid/renderers.py", line 437, in render
     result = renderer(value, system_values)
   File "/var/site/renderers.py", line 30, in __call__
     return customjson.dumps(value)
   File "/var/site/customjson.py", line 38, in dumps
     skipkeys=True).encode(*args)
   File "/usr/lib/python2.7/json/encoder.py", line 203, in encode
     return ''.join(chunks)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 93: ordinal not in range(128)
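
The last frame, ''.join(chunks), is where Python 2 mixes byte strings and unicode objects: joining the two forces an implicit ASCII decode of the byte strings. A minimal sketch of that decode step, assuming the offending chunk holds the UTF-8 bytes of "Noûs" (journal_bytes is a hypothetical stand-in, and the failing position here is 2 rather than 93 because the sample is shorter):

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for a byte-string chunk: the UTF-8 encoding of "Noûs".
journal_bytes = b"No\xc3\xbbs"

# Python 2 performs this ASCII decode implicitly when a str is joined
# with unicode objects; here it is spelled out explicitly.
try:
    journal_bytes.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the codec the bytes were actually written in succeeds.
journal_text = journal_bytes.decode("utf-8")
```

The byte 0xc3 is the first byte of the two-byte UTF-8 sequence for "û", which is why the ASCII codec rejects it.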

Return from the parser

We feed in

Allen, C. 1995 "It isn't what you think: a new idea about intentional causation." Noûs 29,1:115-126

and we get back a dict object from the parser that looks like this:

{'title': '\\"It isn\'t what you think: a new idea about intentional causation.\\"', 'journal': 'No\\xc3\\xbbs', 'author': 'Allen, C.', 'number': 1, 'volume': 29, 'date': '1995', 'type': 'article', 'pages': u'115\u2013126'}
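
Note that the dict mixes plain byte strings (for example 'journal') with a unicode object ('pages'). One possible way to normalize it before it reaches the JSON encoder is to decode every byte string up front. This is only a sketch: to_text is a hypothetical helper, and it assumes the parser's byte strings are UTF-8 and that 'journal' really holds the bytes \xc3\xbb.

```python
import json

def to_text(value, encoding="utf-8"):
    """Recursively decode byte strings so the structure holds only text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, dict):
        return {to_text(k, encoding): to_text(v, encoding)
                for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_text(v, encoding) for v in value]
    return value  # numbers, None, and text pass through unchanged

# Trimmed-down, assumed version of the parser output.
parsed = {"journal": b"No\xc3\xbbs", "pages": u"115\u2013126", "volume": 29}

clean = to_text(parsed)
encoded = json.dumps(clean, ensure_ascii=False)
```

With every value already text, the encoder never has to join byte strings with unicode chunks, so ensure_ascii=False no longer triggers the implicit ASCII decode.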

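What I've tried

Because the application runs in a virtual environment, I felt free to go into site.py and change the default encoding from ascii to utf-8.

I've tried encoding and decoding, as well as adding charset=utf8&use_unicode=1 to my SQLAlchemy URL, all to no avail.

I suspect the problem lies with the ensure_ascii=False option in customjson.py. In fact, the documentation for the Python 2.7 JSON encoder says the following: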

if ensure_ascii is False, some chunks written to fp may be unicode instances. This usually happens because the input contains unicode strings or the encoding parameter is used. Unless fp.write() explicitly understands unicode (as in codecs.getwriter()) this is likely to cause an error.

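Setting ensure_ascii=True seems to resolve the error. Given that the JSON encoder's default encoding is already utf-8, I'm not sure that setting it manually would help. I need those Unicode characters, so I'm not quite sure how to solve this.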
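
For what it's worth, ensure_ascii=True does not drop the characters; it only writes them as \uXXXX escapes, which any JSON parser turns back into the same text. A small sketch with the standard json module (the sample dict is made up for illustration):

```python
import json

# Made-up sample containing a non-ASCII character.
data = {"journal": u"No\xfbs"}

escaped = json.dumps(data)                  # ensure_ascii=True is the default
raw = json.dumps(data, ensure_ascii=False)  # keeps the character as-is

# Both forms decode to exactly the same data.
same = json.loads(escaped) == json.loads(raw) == data
```

So if the escaped output is acceptable on the wire, the client should still see "Noûs" after parsing.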

On the client side, JSON.stringify was being called to escape the troublesome characters. Removing that call made Python work as expected.