Simple Scrapy program runs successfully in the shell but does not export data to CSV

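I have been trying to scrape data from the comments on one specific link. When I run the code in the Scrapy shell it works, but when I try to export the results to a CSV file I only get comment_user and not comment_data. Why?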

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.selector import Selector
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from urlparse import urljoin
from commen.items import CommenItem

class criticspider(CrawlSpider):
    name ="delh"
    allowed_domains =["consumercomplaints.in"]
    #start_urls =["http://www.consumercomplaints.in/?search=delhivery&page=2","http://www.consumercomplaints.in/?search=delhivery&page=3","http://www.consumercomplaints.in/?search=delhivery&page=4","http://www.consumercomplaints.in/?search=delhivery&page=5","http://www.consumercomplaints.in/?search=delhivery&page=6","http://www.consumercomplaints.in/?search=delhivery&page=7","http://www.consumercomplaints.in/?search=delhivery&page=8","http://www.consumercomplaints.in/?search=delhivery&page=9","http://www.consumercomplaints.in/?search=delhivery&page=10","http://www.consumercomplaints.in/?search=delhivery&page=11"]
    start_urls=["http://www.consumercomplaints.in/movement-delivery/delhivery-courier-service-c783976"]

    def parse(self,response):

        sites = response.xpath('//table[@style="width:100%"]')
        items = []

        for site in sites:
            item = CommenItem()
            item['comment_user'] = site.xpath('.//td[@class="comments"]/div[1]/a/text()').extract()
            item['comment_data'] = site.xpath('.//tr[3]/td/div/text()').extract()
            items.append(item)
        return items

The logic implemented in the parse() method is slightly off. I would do it this way:

def parse(self,response):
    sites = response.xpath('//td/div[starts-with(@id, "c")]')
    for site in sites:
        item = CommenItem()
        item['comment_user'] = site.xpath('.//td[@class="comments"]/div[1]/a/text()').extract()[0].strip()
        item['comment_data'] = ''.join(site.xpath('.//td[@class="compl-text"]/div//text()').extract()).strip()
        yield item
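
To see what the `[0].strip()` and `''.join(...).strip()` calls above do to the exported values, here is a minimal standalone sketch that does not need Scrapy. It assumes only that `.extract()` returns a list of strings. The helper names `first_text` and `joined_text` and the sample fragments are hypothetical, made up for illustration. Unlike a bare `[0]`, `first_text` returns an empty string when nothing matched instead of raising an `IndexError`.

```python
import csv
import io


def first_text(fragments):
    """Return the first extracted string, stripped, or '' if nothing matched."""
    return fragments[0].strip() if fragments else ''


def joined_text(fragments):
    """Concatenate all extracted text nodes into one stripped string."""
    return ''.join(fragments).strip()


# Hypothetical lists standing in for what .extract() might return for one comment.
user_fragments = ['  alice ']
data_fragments = ['\r\n  Parcel never arrived.', ' ', 'No reply from support.\r\n']

# Each field becomes a single string, so it maps to exactly one CSV cell.
row = {
    'comment_user': first_text(user_fragments),
    'comment_data': joined_text(data_fragments),
}

# Write the row to an in-memory CSV to inspect the result.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['comment_user', 'comment_data'])
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

With the spider itself, the export would typically be run as `scrapy crawl delh -o comments.csv`, where `comments.csv` is just an example file name.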