Scrapy XPath can't get values

I have a website from which I want to save the values of two span elements.

This is the relevant part of my HTML code:

<div class="box-search-product-filter-row">

    <span class="result-numbers" sth-bind="model.navigationSettings.showFilter">

    <span class="number" sth-bind="span1"></span>

    <span class="result" sth-bind="span2"></span>

    </span>

</div>

I created a spider:

from scrapy.spiders import Spider
from scrapy.selector import Selector

class MySpdier(Spider):

    name = "list"
    allowed_domains = ["example.com"]
    start_urls = [
        "https://www.example.com"]

    def parse(self, response):
        sel = Selector(response)
        divs = sel.xpath("//div[@class='box-search-product-filter-row']")


        for div in divs:
            sth = div.xpath("/span[class='result']/text()").extract()

            print sth

When I run the spider, it only prints this:

[]

Can anyone help me get the values from my two span elements (class "number" and class "result")?

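You forgot the @ in the XPath "/span[class='result']/text()". Without it, class is read as a child element name rather than an attribute. Also, the span you are looking for is not a first-level child of the div, so you need to use .// instead of /. Source: http://www.w3schools.com/xsl/xpath_syntax.asp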
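
The difference between the three expressions can be checked offline. The following is a minimal sketch that uses Python's standard-library xml.etree.ElementTree as a stand-in for Scrapy's selector (an assumption made only so the snippet runs without Scrapy; ElementTree supports just a subset of XPath, but it treats these particular expressions the same way). The markup is the snippet from the question.

```python
import xml.etree.ElementTree as ET

# Markup copied from the question; it happens to be well-formed XML,
# so the standard-library parser can read it.
HTML = """
<div class="box-search-product-filter-row">
    <span class="result-numbers" sth-bind="model.navigationSettings.showFilter">
    <span class="number" sth-bind="span1"></span>
    <span class="result" sth-bind="span2"></span>
    </span>
</div>
"""

div = ET.fromstring(HTML.strip())

# Direct children only: the wanted span is nested one level deeper, so no match.
direct_children = div.findall("span[@class='result']")

# Any descendant of the div: this finds the nested span.
descendants = div.findall(".//span[@class='result']")

# Without '@', the predicate looks for a child *element* named "class".
no_at = div.findall(".//span[class='result']")

print(len(direct_children), len(descendants), len(no_at))
```

Only the second expression matches, which is why both the @ and the .// are needed.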

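The complete and correct XPath would be ".//span[@class='result']", with '/text()' appended if you only want to select the text. However, the nodes in your example contain no text, so it will not return anything here.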
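
To illustrate the point about empty nodes, here is a small sketch, again using the standard-library xml.etree.ElementTree rather than Scrapy (a swapped-in tool, chosen so the snippet is self-contained). The helper span_texts and the FILLED markup with its "42" and "results" values are hypothetical and are not taken from the real page.

```python
import xml.etree.ElementTree as ET

def span_texts(markup, css_class):
    """Return the non-empty text of every descendant span with the given class."""
    root = ET.fromstring(markup.strip())
    spans = root.findall(f".//span[@class='{css_class}']")
    # Skip spans without text, mirroring how /text() selects nothing for them.
    return [span.text for span in spans if span.text]

# The spans as they appear in the question: present, but empty.
EMPTY = """
<div class="box-search-product-filter-row">
    <span class="result-numbers">
    <span class="number" sth-bind="span1"></span>
    <span class="result" sth-bind="span2"></span>
    </span>
</div>
"""

# Hypothetical markup with made-up values, for comparison only.
FILLED = """
<div class="box-search-product-filter-row">
    <span class="result-numbers">
    <span class="number" sth-bind="span1">42</span>
    <span class="result" sth-bind="span2">results</span>
    </span>
</div>
"""

print(span_texts(EMPTY, "result"))   # the span is found but has no text
print(span_texts(FILLED, "number"))
print(span_texts(FILLED, "result"))
```

If the corrected spider still prints an empty list, it may be worth checking whether the spans contain any text in the HTML that Scrapy actually downloads.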

This should work for you.

Edit:

from scrapy.spiders import Spider
from scrapy.selector import Selector

class MySpdier(Spider):

    name = "list"
    allowed_domains = ["example.com"]
    start_urls = [
        "https://www.example.com"]

    def parse(self, response):
        sel = Selector(response)
        divs = sel.xpath("//div[@class='box-search-product-filter-row']")

        for div in divs:
            # .// searches every descendant of this div; @class tests the attribute
            sth = div.xpath(".//span[@class='result']/text()").extract()
            print(sth)