Python UnicodeDecodeError on parsing a JSON URL
I am using Python 3.4 and trying to parse what looks like valid JSON output from a URL. For example:
http://api.stackexchange.com/2.2/questions?order=desc&sort=activity&site=Whosebug
This is what my code looks like:
import json
from urllib.request import urlopen

def jsonify(url):
    response = urlopen(url).read().decode('utf8')
    repo = json.loads(response)
    return repo

url = jsonify('http://api.stackexchange.com/2.2/questions?order=desc&sort=activity&site=Whosebug')
However, I get an error such as:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
The script works with other APIs, such as GitHub's and many others, but not with the Stack Exchange API.
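The response is compressed with gzip, so you have to decompress it before decoding. The Content-Encoding header in the curl output shows this: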
$ curl -v http://api.stackexchange.com/2.2/questions\?order\=desc\&sort\=activity\&site\=Whosebug
* Trying 198.252.206.16...
* TCP_NODELAY set
* Connected to api.stackexchange.com (198.252.206.16) port 80 (#0)
> GET /2.2/questions?order=desc&sort=activity&site=Whosebug HTTP/1.1
> Host: api.stackexchange.com
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: private
< Content-Type: application/json; charset=utf-8
< Content-Encoding: gzip
See the api.stackexchange docs for details.
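To see why the error points at byte 0x8b in position 1, here is a small offline sketch. It assumes a made-up JSON payload in place of the real API response. Every gzip stream begins with the two magic bytes 0x1f 0x8b: the first is a valid ASCII control character, and the second cannot start a UTF-8 sequence, so decoding fails at index 1.

```python
import gzip

# Hypothetical payload standing in for the API response body.
payload = '{"items": []}'.encode("utf-8")
compressed = gzip.compress(payload)

# Every gzip stream starts with the magic bytes 0x1f 0x8b.
magic = compressed[:2]

error_position = None
error_reason = None
try:
    # This is what the question's code does with the raw response.
    compressed.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0x1f decodes fine as ASCII, so the failure is reported at index 1.
    error_position = exc.start
    error_reason = exc.reason

# Decompressing first and decoding afterwards recovers the original text.
restored = gzip.decompress(compressed).decode("utf-8")
```

So the traceback in the question is a strong hint that the body is gzip data rather than badly encoded text.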
Decompression example:
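import gzip
import json
from urllib.request import urlopen

def jsonify(url):
    # Read the raw (gzip-compressed) bytes without decoding them yet
    response = urlopen(url).read()
    # Decompress first, then decode the resulting bytes as UTF-8
    tmp = gzip.decompress(response).decode('utf-8')
    repo = json.loads(tmp)
    return repo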
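If the same function also has to work with APIs that do not compress their responses (the GitHub case mentioned in the question), one possible variation is to check the Content-Encoding header before decompressing. This is only a sketch: fetch_json and decode_body are hypothetical helper names, and it assumes the body is UTF-8 and that gzip is the only content encoding you need to handle.

```python
import gzip
import json
from urllib.request import urlopen


def decode_body(raw, content_encoding):
    """Return the body as text, decompressing only if it is gzip-encoded."""
    if content_encoding.strip().lower() == "gzip":
        raw = gzip.decompress(raw)
    # Assumption: the payload is UTF-8, as the charset in the curl output suggests.
    return raw.decode("utf-8")


def fetch_json(url):
    """Fetch a URL and parse its body as JSON (hypothetical helper)."""
    with urlopen(url) as response:
        # The header is missing when the server sends an uncompressed body.
        encoding = response.headers.get("Content-Encoding", "")
        raw = response.read()
    return json.loads(decode_body(raw, encoding))
```

Keeping decode_body separate from the network call makes it easy to try out with local bytes, for example decode_body(gzip.compress(b'{"a": 1}'), "gzip").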