How to upload a file with unicode content to the server?

I am trying to upload a file to the server using the following code:

def build_request(self, theurl, fields, files, txheaders=None):
    content_type, body = self.encode_multipart_formdata(fields, files)
    if not txheaders: txheaders = {}
    txheaders['Content-type'] = content_type
    txheaders['Content-length'] = str(len(body))
    return urllib2.Request(theurl, body, txheaders)

def encode_multipart_formdata(self,fields, files, BOUNDARY = '-----'+mimetools.choose_boundary()+'-----'):
    ''' from www.voidspace.org.uk/atlantibots/pythonutils.html '''
    CRLF = '\r\n'
    L = []
    if isinstance(fields, dict):
        fields = fields.items()
    for (key, value) in fields:
        L.append('--' + BOUNDARY)
        L.append('Content-Disposition: form-data; name="%s"' % key)
        L.append('')
        L.append(value)
    for (key, filename, value) in files:
        filetype = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
        L.append('--' + BOUNDARY)
        L.append('Content-Disposition: form-data; name="%s"; filename="%s"' % (key, filename))
        L.append('Content-Type: %s' % filetype)
        L.append('')
        L.append(value)
    L.append('--' + BOUNDARY + '--')
    L.append('')
    body = CRLF.join(L)
    content_type = 'multipart/form-data; boundary=%s' % BOUNDARY
    return content_type, body

mp3 = ('file', file, open(file,'rb').read())
url = self.build_request(upload_url, {}, (mp3,))
res = urllib2.urlopen(url)

I upload an mp3 file (0001.mp3) and get the following error:

(type 'exceptions.UnicodeDecodeError', UnicodeDecodeError('ascii', '-------192.1xx.xx.xx.501.5413.1420317280.341.1-----\r\nContent-Disposition: form-data; name="file"; filename="/Users/Me/Music/0001.mp3"\r\nContent-Type: audio/mpeg\r\n\r\nID3\x04\x00\x00\x00\x00\x00#TSSE\x00\x00\x00\x0f\x00\x00\x03Lavf55.33.100\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xfb\x94\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00Info\x00\x00\x00\x07\x00\x00#\xf9\x005\xf7\x00\x00\x02\x05\x08\n\r\x10\x12\x14\x17\x19\x1c\x1f!$\'(+.0368;=@BEHJMOQTWY\_acfhknpsvxz}\x80\x82\x85\x88\x89\x8c\x8f\x91\x94\x97\x99\x9c\x9e\xa1\xa3\xa6\xa9\xab\xae\xb1......UUUUU', 45, 46, 'ordinal not in range(128)'), )

What is wrong?

Update. The full traceback is as follows:

Traceback (most recent call last):
  File "test.py", line 258, in <module>
    upload()
  File "test.py", line 242, in upload
    success = uploadFile(file)
  File "test.py", line 179, in uploadFile
    res = urllib2.urlopen(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1181, in do_open
    h.request(req.get_method(), req.get_selector(), req.data, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 973, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1007, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 969, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 827, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 220: ordinal not in range(128)

The traceback shows Python trying to decode your request body; because the body is binary data, the default encoding Python uses for implicit decoding (ASCII) fails here.

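Python is trying to decode your request body because the first part of what is sent to the server, the HTTP headers (including the initial request line with the method, path and HTTP version), is already a unicode object. This happens if at least one of your headers or your URL is a unicode object and the rest is decodable as ASCII.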
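To make that mechanism concrete, here is a minimal sketch that behaves the same on Python 2 and Python 3. It performs explicitly the ASCII decode that Python 2 performs implicitly when a unicode header block is concatenated with a byte-string body. The header text, the body bytes and the function name concat_like_python2 are made-up stand-ins for illustration, not values taken from the traceback.

```python
# Illustrative stand-ins for what httplib assembles before sending.
header_block = u'POST /upload HTTP/1.1\r\nHost: example.com\r\n\r\n'  # unicode
message_body = b'ID3\x04\x00\xff\xfb\x94'                              # raw bytes


def concat_like_python2(headers, body):
    """Mimic Python 2's `unicode + str`: the bytes are decoded as ASCII first."""
    return headers + body.decode('ascii')


try:
    concat_like_python2(header_block, message_body)
    error = None
except UnicodeDecodeError as exc:
    # The first non-ASCII byte (0xff) in the binary body triggers the failure.
    error = exc

# When the headers are bytes as well, nothing has to be decoded.
safe_message = header_block.encode('ascii') + message_body
```

The second concatenation succeeds because both operands are byte strings, which is the state the fix aims for.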

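Make sure your headers and URL are encoded to bytes. If they are not ASCII, you need to encode them explicitly; headers generally use Latin-1 or opaque bytes (HTTP 1.0 offered a MIME-compatible encoding option that no one ever uses), and the URL must be ASCII, with any path elements or query parameters encoded to UTF-8 and then URL-encoded (see "how to deal with ® in url for urllib2.urlopen?" for sample code).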
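As a rough sketch of the "encode to UTF-8, then URL-encode" step for a path, assuming the scheme and host are already plain ASCII: the helper name ascii_url and the example host and path are hypothetical, and only the standard-library quote function is used.

```python
try:
    from urllib import quote          # Python 2
except ImportError:
    from urllib.parse import quote    # Python 3


def ascii_url(base, path):
    """Return base + path, with the path UTF-8 encoded and then percent-encoded.

    `base` is assumed to be ASCII already (scheme and host only).
    """
    quoted = quote(path.encode('utf-8'))  # '/' is left unescaped by default
    return base + quoted


# Hypothetical path containing a non-ASCII letter, a space and the (R) sign.
url = ascii_url('http://example.com', u'/files/caf\xe9 \xae.mp3')
```

The result contains only ASCII characters, so on Python 2 it is a plain byte string that can be passed to urllib2.Request without triggering an implicit decode.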

Since you are not passing in extra headers, I am assuming your URL is a unicode object; encode it here:

url = self.build_request(upload_url.encode('ascii'), {}, (mp3,))
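If it is not obvious where the unicode value comes from, one option is to normalize the URL and any headers just before building the request. The sketch below targets Python 2, as in the question (on Python 3, urllib.request expects str URLs, so this approach does not carry over). The helper names to_bytes and bytes_headers and the example values are hypothetical.

```python
text_type = type(u'')  # `unicode` on Python 2, `str` on Python 3


def to_bytes(value, encoding='ascii'):
    """Encode text to bytes; pass byte strings through untouched."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


def bytes_headers(headers):
    """Return a copy of `headers` with every name and value as Latin-1 bytes."""
    return dict((to_bytes(k, 'latin-1'), to_bytes(v, 'latin-1'))
                for k, v in headers.items())


# Example values are made up; a non-ASCII URL would raise UnicodeEncodeError
# here, which points at the offending value instead of failing inside httplib.
upload_url = to_bytes(u'http://example.com/upload')
txheaders = bytes_headers({u'X-Title': u'caf\xe9', 'Content-type': 'audio/mpeg'})
```

Encoding the URL as ASCII is deliberate: a failure at this point means the URL still needs percent-encoding rather than a different byte encoding.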