Using following-sibling in XPath with Scrapy
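I'm trying to extract the year from the HTML below (https://www.espncricinfo.com/series/indian-premier-league-2022-1298423/punjab-kings-vs-delhi-capitals-64th-match-1304110/full-scorecard). Because of how the site is coded, I first have to identify the table cell that contains the word "Season" and then get the year (2022 in this example).
I thought this would get it, but it doesn't. There are no errors, just no results. I haven't used the following-sibling axis before, so if someone could point out where I messed up, I'd appreciate it.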
l.add_xpath(
'Season',
"//td[contains(text(),'Season')]/following-sibling::td[1]/a/text()")
html:
<tr class="ds-border-b ds-border-line">
<td class="ds-min-w-max ds-border-r ds-border-line">
<span class="ds-text-tight-s ds-font-medium">Season</span>
</td>
<td class="ds-min-w-max">
<span class="ds-inline-flex ds-items-center ds-leading-none">
<a href="https://www.espncricinfo.com/ci/engine/series/index.html?season2022" class="ds-text-ui-typo ds-underline ds-underline-offset-4 ds-decoration-ui-stroke hover:ds-text-ui-typo-primary hover:ds-decoration-ui-stroke-primary ds-block">
<span class="ds-text-tight-s ds-font-medium">2022</span>
</a>
</span>
</td>
</tr>
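Try the expression below. In your version, text() on the first td only sees that cell's own text nodes, which are whitespace here, because "Season" sits inside a nested span. In addition, the a element is wrapped in a span, so it is not a direct child of the following td.
//span[contains(text(),"Season")]/../following-sibling::td/span/a/span/text()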