Unable to convert unicode into JSON in Scrapy
import scrapy
import json

class GettingtonDSpider(scrapy.Spider):
    name = "gettington_d"
    allowed_domains = ["gettington.com"]
    start_urls = ['https://api.gettington.com/v1/products?showMPP=false&rows=24&q=Keyword:south%20shore%20furniture&productfilter=null&callback=searchCallback']

    def parse(self, response):
        jsonresp = json.dumps(response.body)
        jsonresp = json.loads(jsonresp)
I tried several approaches, but all of them failed:
- response.text
- encode('utf-8')
- response_body_as_unicode
None of the above worked. How can I fix this error?
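The URL requests a JSONP response (callback=searchCallback), so the JSON payload arrives wrapped in a searchCallback(...) call and the body is not valid JSON as it stands. You first have to strip that wrapper from the response body: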
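import re
...
# Use response.text (a str); response.body is bytes on Python 3 and would need a bytes pattern.
json_string = re.search(r'searchCallback\((.*)\)', response.text, re.DOTALL).group(1)
jsonresp = json.loads(json_string)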
Now jsonresp holds a dict.
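The unwrapping step can also be pulled out into a small helper so it works for any callback name, not only searchCallback. The following is a minimal sketch under a few assumptions: parse_jsonp is a hypothetical helper name (it is not part of Scrapy), the body is assumed to be UTF-8, and the sample string is made up for illustration rather than taken from the real API.

```python
import json
import re

# Hypothetical helper (not a Scrapy API): matches "<callback>(<payload>)",
# optionally followed by a semicolon, and captures the payload.
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def parse_jsonp(text):
    """Strip a JSONP wrapper and return the decoded JSON payload."""
    if isinstance(text, bytes):
        # Assumption: the body is UTF-8 encoded.
        text = text.decode('utf-8')
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError('response does not look like JSONP')
    return json.loads(match.group(1))


# Made-up sample payload, only to show the shape of the input.
sample = 'searchCallback({"products": [{"name": "desk"}], "total": 1});'
data = parse_jsonp(sample)
print(data["total"])
```

Inside the spider's parse method this would be called as parse_jsonp(response.text). Raising ValueError when the wrapper is missing avoids the AttributeError that re.search(...).group(1) produces when the pattern does not match.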