Unicode issue in Scrapy (Python)

Question

I have been searching on this topic for two hours and have tried many solutions, but none of them solved my problem. Here is the code first:

import scrapy

class HamburgSpider(scrapy.Spider):
    name = 'hamburg'
    #allowed_domains = ['https://www.hamburg.de']
    start_urls = ['https://www.hamburg.de/branchenbuch/hamburg/10239785/n0/']
    custom_settings = {
        'FEED_EXPORT_FORMAT': 'utf-8'
    }

    def parse(self, response):
        #response=response.body.encode('utf-8')
        items = response.xpath("//div[starts-with(@class, 'item')]")
        for item in items:
            business_name = item.xpath(".//h3[@class='h3rb']/text()").get()
            address1 = item.xpath(".//div[@class='address']/p[@class='extra post']/text()[1]").get()
            address2 = item.xpath(".//div[@class='address']/p[@class='extra post']/text()[2]").get()
            phone = item.xpath(".//div[@class='address']/span[@class='extra phone']/text()").get()

            yield {
                'Business Name': business_name,
                'Address1': address1,
                'Address2': address2,
                'Phone Number': phone
            }

I put this line in the code:

custom_settings = { 'FEED_EXPORT_FORMAT': 'utf-8' }

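That line was supposed to handle the encoding problem, but when I export the results to CSV the problem is still there. I just need the text to appear exactly as it does on the website, for example Poppenbütteler Bogen 29a. Instead, the exported output looks different.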

Answer 1

Your setting name is wrong.

FEED_EXPORT_FORMAT is not one of the settings Scrapy uses, so the line has no effect on the export. You need FEED_EXPORT_ENCODING instead.
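As a minimal sketch, the corrected dict would look roughly like the one below, assuming the rest of the spider stays unchanged. The helper `csv_bytes` is hypothetical and uses only the standard library, not Scrapy's exporter. It is there only to illustrate what the encoding name changes in the bytes written, and how UTF-8 bytes read with a legacy codec such as cp1252 can turn into garbled umlauts.

```python
import csv
import io

# Corrected setting name; this dict would replace `custom_settings` in the spider.
# 'utf-8' is the value from the question. 'utf-8-sig' (UTF-8 with a BOM) is an
# assumption on my part: it may help if the CSV is opened in Excel.
custom_settings = {
    "FEED_EXPORT_ENCODING": "utf-8",
}


def csv_bytes(row, encoding):
    """Illustration only (hypothetical helper): encode one CSV row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue().encode(encoding)


row = ["Poppenbütteler Bogen 29a"]

# Bytes as they would be written with the encoding from the settings dict.
utf8 = csv_bytes(row, custom_settings["FEED_EXPORT_ENCODING"])

# Decoding UTF-8 bytes with a legacy Windows codec garbles the umlaut.
garbled = utf8.decode("cp1252")

# The same row with a byte order mark prepended.
with_bom = csv_bytes(row, "utf-8-sig")

print(utf8.decode("utf-8").strip())
print(garbled.strip())
print(with_bom[:3])
```

If the file is correct on disk but still looks wrong in a spreadsheet program, the viewer is probably guessing the encoding. In that case, switching the value to 'utf-8-sig' is worth trying.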
