Fetching data from Wikipedia using the XPath query function in Google Sheets ImportXML

Question

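What is the correct XPath query to pull data from Wikipedia into Google Sheets?

Here is an example I would like to test:

Wikipedia page: http://en.wikipedia.org/wiki/12_Angry_Men_(1957_film)

Data to pull: the "Running time" value ("96 minutes") in the table on the right-hand side

Method: the Google Sheets ImportXML function

I tried the following, but it returns N/A: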

=IMPORTXML("http://en.wikipedia.org/wiki/12_Angry_Men_(1957_film)", "//div[normalize-space() = 'Running time']/following-sibling::td")

Thanks!

Answer 1

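Your XPath has a couple of problems.

The following-sibling axis does not work with that page's markup, because the td that follows the 'Running time' div is a sibling of the div's parent, not of the div itself. Instead, use the following axis with a node test for td: following::td. That still returns every td node after the selected div, so a predicate is also needed to select only the first one: [1].

The complete function with this XPath: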

=IMPORTXML("http://en.wikipedia.org/wiki/12_Angry_Men_%281957_film%29", "//div[normalize-space()='Running time']/following::td[1]")
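
To see why following-sibling comes back empty, here is a minimal Python sketch run against a made-up, heavily simplified stand-in for the infobox markup. The SAMPLE string, the th wrapper around the div, and the helper name inspect_label are all assumptions for illustration, not the real page source. Python's standard xml.etree.ElementTree supports only a subset of XPath and has no following axis, so instead of running following::td[1] the sketch walks up from the div with ".." to show where the td actually sits.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the infobox markup (an assumption,
# not copied from the real Wikipedia page).
SAMPLE = """
<table>
  <tr>
    <th><div>Running time</div></th>
    <td>96 minutes</td>
  </tr>
  <tr>
    <th><div>Country</div></th>
    <td>United States</td>
  </tr>
</table>
"""


def inspect_label(xml_text, label_text):
    """Return (tags of the label div's siblings, text of the td in its row)."""
    root = ET.fromstring(xml_text)
    path = f".//div[.='{label_text}']"

    # The label div, matched by its full text content.
    label = root.find(path)

    # Everything that shares the div's parent. A td would have to appear
    # here for following-sibling::td to match.
    parent = root.find(path + "/..")
    siblings = [child.tag for child in parent if child is not label]

    # The td is a child of the row, i.e. a sibling of the div's parent.
    row = root.find(path + "/../..")
    value = row.find("td").text

    return siblings, value


siblings, value = inspect_label(SAMPLE, "Running time")
print(siblings)  # tags next to the div under the same parent
print(value)     # text of the first td in the same row
```

With this sample, the div has no siblings at all, and the value is found one level up, which is the situation the following axis handles in the Sheets formula.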
