Cannot evaluate XPath on XMLDoc without placing it in the DOM

Question

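The only way I can get result nodes with XPath is from the DOM, which doesn't feel right.

Setup:

I'm trying to display a fragment of an XML document (TEI/XML) on my HTML page. I have the URL of the XML document and an XPath selector for the fragment. I figured I could fetch() the document and extract the part I want: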

// Real values, for one case, 
// t.source = "https://centerfordigitalhumanities.github.io/Dunbar-books/The-Complete-Poems-TEI.xml"
// t.selector.value = "//div[@type='poem'][8]"

const sampleSource = await fetch(t.source)
  .then(res => res.text())
  .then(docStr => (new DOMParser()).parseFromString(docStr, "application/xml"))

const poemText = sampleSource.evaluate(t.selector?.value, sampleSource, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null)

textSample.innerHTML = poemText.snapshotItem(0).innerHTML

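No results

I tried a few different approaches (changing the contextNode, using XPathSelector.evaluate() instead of XMLDoc.evaluate(), and changing the XPathResult type), but the result was always empty.

In desperation, I tried simpler and simpler selectors and found that evaluate() was only traversing my current HTML document, even though nothing referenced it.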

Workaround

Dumping the XML document into a hidden element on the page "works".

const sampleSource = await fetch(t.source)
  .then(res => res.text())
  .then(docStr => hiddenElem.innerHTML = docStr)

const poemText = document.evaluate(t.selector?.value, hiddenElem, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null)

textSample.innerHTML = poemText.snapshotItem(0).innerHTML

Questions

Is it true that evaluate() only traverses document?
Is there a better practice than my workaround?

Answer 1

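Well, it is a TEI document, so its elements are in the namespace http://www.tei-c.org/ns/1.0. Don't expect to take an XML DOM document and select elements in any namespace with a path like div; that simply selects div elements in no namespace. To select elements in a namespace with XPath 1.0, you need to use the third argument of evaluate to bind a prefix of your choice (such as tei) to that namespace, and then use e.g. //tei:div[@type='poem'][8]: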

<script type=module>
const sampleSource = await fetch('https://centerfordigitalhumanities.github.io/Dunbar-books/The-Complete-Poems-TEI.xml')
  .then(res => res.text())
  .then(docStr => (new DOMParser()).parseFromString(docStr, "application/xml"));

const poemText = sampleSource.evaluate(`//tei:div[@type='poem'][8]`, sampleSource, prefix => prefix === 'tei' ? 'http://www.tei-c.org/ns/1.0' : null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);

console.log(poemText.snapshotItem(0).textContent);
</script>
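
If several prefixes or several queries are involved, the inline arrow function above can be factored out. The following is only a sketch: makeResolver and selectFirst are hypothetical helper names, not part of any DOM API, and selectFirst assumes a browser environment where XPathResult is defined.

```javascript
// Hypothetical helper: build a namespace resolver from a prefix-to-URI map.
// evaluate() calls the resolver with each prefix found in the expression.
function makeResolver(prefixMap) {
  return prefix =>
    Object.prototype.hasOwnProperty.call(prefixMap, prefix)
      ? prefixMap[prefix]
      : null; // unknown prefix: no binding
}

const teiResolver = makeResolver({ tei: 'http://www.tei-c.org/ns/1.0' });

// Hypothetical helper: return the first node matching xpath in xmlDoc, or null.
// Browser only, because it relies on the XPathResult global.
function selectFirst(xmlDoc, xpath, resolver) {
  const result = xmlDoc.evaluate(
    xpath,
    xmlDoc,
    resolver,
    XPathResult.FIRST_ORDERED_NODE_TYPE,
    null
  );
  return result.singleNodeValue;
}
```

With the parsed document from the snippet above, selectFirst(sampleSource, "//tei:div[@type='poem'][8]", teiResolver) would then return the poem element, or null if nothing matches.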

With XPath 2 or 3, as supported by Saxon-JS 2 for example, you can bind a default element namespace and use unqualified names like div to select elements in that namespace.

<script src=https://www.saxonica.com/saxon-js/documentation/SaxonJS/SaxonJS2.rt.js></script>

<script type=module>
    const sampleSource = await SaxonJS.getResource({ location : 'https://centerfordigitalhumanities.github.io/Dunbar-books/The-Complete-Poems-TEI.xml', type : 'xml' });


    const poemText = SaxonJS.XPath.evaluate(`//div[@type='poem'][8]`, sampleSource, { xpathDefaultNamespace : 'http://www.tei-c.org/ns/1.0' });

    console.log(poemText.textContent);
</script>

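In XPath 1.0 there is no way to do that, unless the DOM environment lets you build a namespace-less DOM (e.g. Java with a non-namespace-aware DocumentBuilder). But as far as I know, that is not the case inside the browser.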
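
A different XPath 1.0 route, which avoids the resolver rather than giving unprefixed names a namespace, is to test local-name() in a predicate. This is a sketch only: anyNamespace is a hypothetical helper name, and the resulting path matches div elements from every namespace, not just TEI, so it is looser than the prefixed version.

```javascript
// Hypothetical helper: build an XPath 1.0 step that matches an element by its
// local name, regardless of which namespace (if any) the element is in.
function anyNamespace(localName) {
  // Reject anything that is not a plain ASCII name, so that quotes or
  // brackets cannot leak into the generated expression.
  if (!/^[A-Za-z_][A-Za-z0-9_.-]*$/.test(localName)) {
    throw new TypeError(`not a simple element name: ${localName}`);
  }
  return `*[local-name()='${localName}']`;
}

// Same selection as //div[@type='poem'][8], minus the namespace check.
const poemPath = `//${anyNamespace('div')}[@type='poem'][8]`;
console.log(poemPath);
```

The string in poemPath can be passed to sampleSource.evaluate() with null as the third argument, since the expression contains no prefixes to resolve.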


