Getting this error when scraping JSON with scrapy: Spider must return request, item, or None, got 'str'

Question

I am trying to use scrapy to get the JSON field with the key "longName", but I get this error message: "Spider must return request, item, or None, got 'str'".

The JSON I am trying to scrape looks like this:

{
   "id":5355,
   "code":9594,

}

Here is my code:

import scrapy
import json

class NotesSpider(scrapy.Spider):
    name = 'notes'
    allowed_domains = ['blahblahblah.com']
    start_urls = ['https://blahblahblah.com/api/123']

    def parse(self, response):
        data = json.loads(response.body)
        yield from data['longName']

When I run "scrapy crawl notes" at the prompt, I get the error above. Can anyone point me in the right direction?

Answer 1

If you only want longName, changing your parse method like this should solve the problem:

    def parse(self, response):
        data = json.loads(response.body)
        yield {"longName": data["longName"]}
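As for why the original fails: `yield from` on a string iterates over it, so the spider ends up yielding one-character `str` values, which is what the error message is complaining about. Here is a minimal sketch of that behaviour in plain Python, without Scrapy. The `body` payload and the `"Example name"` value are made up for illustration (the real response is not shown in full in the question), and `parse_broken` / `parse_fixed` are hypothetical stand-ins for the spider's `parse` method.

```python
import json

# Hypothetical payload: the question does not show the real "longName" value.
body = b'{"id": 5355, "code": 9594, "longName": "Example name"}'


def parse_broken(body):
    """Mimics the original parse(): yields the string character by character."""
    data = json.loads(body)
    # `yield from` iterates over the string, producing 'E', 'x', 'a', ...
    yield from data["longName"]


def parse_fixed(body):
    """Mimics the corrected parse(): yields a single dict item."""
    data = json.loads(body)
    yield {"longName": data["longName"]}


broken = list(parse_broken(body))
fixed = list(parse_fixed(body))

print(broken[:3])  # ['E', 'x', 'a']
print(fixed)       # [{'longName': 'Example name'}]
```

Each element of `broken` is a plain `str`, which Scrapy does not accept as spider output, while `fixed` holds a dict, which it treats as an item. If you are on Scrapy 2.2 or later, `response.json()` should also work in place of `json.loads(response.body)`.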


python

scrapy

web-scraping

anaconda