Xpath

Question

I am using an XPath similar to this:

//*[ends-with(@id,'eoId')]/table/tbody/tr/td

and extracting the table data as an array. For example, a table with the structure

<tr>
  <td>1</td>
  <td>2</td>
  <td>3</td>
  <td>4</td>
</tr>
<tr>
  <td>5</td>
  <td>6</td>
  <td>7</td>
  <td>8</td>
</tr>

which renders as

1 2 3 4
5 6 7 8

is extracted as the array

1 2 3 4 5 6 7 8

The result I need is in the following format:

1 5 2 6 3 7 4 8

I can use XPath 2.0. Thanks.

Answer 1

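Consider applying a light XSLT transformation to the <tr> nodes and then running the XPath query. XSLT processors are available for many programming languages and tools, including Java, Python, PHP, Excel/Access and VBA, Saxon, etc.: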

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8"/>

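<!-- Emit one merged row in place of the first <tr> -->
<xsl:template name="tdsort" match="tr[1]">
  <tr>
    <!-- Collect the cells of every sibling row, ordered by column position -->
    <xsl:for-each select="../tr/td">
      <!-- Numeric sort, so column 10 does not sort before column 2 -->
      <xsl:sort select="count(preceding-sibling::*) + 1" data-type="number"/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </tr>
</xsl:template>

<!-- Drop the remaining rows; their cells were already copied above -->
<xsl:template match="tr[position() &gt; 1]"/>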

</xsl:stylesheet>

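All other nodes in the original XML can keep their existing structure by passing them through with <xsl:apply-templates/> (for example, in an identity template). The stylesheet above transforms the original rows into the following: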

<tr>
  <td>1</td>
  <td>5</td>
  <td>2</td>
  <td>6</td>
  <td>3</td>
  <td>7</td>
  <td>4</td>
  <td>8</td>
</tr>

From there you can use the XPath:

 //*[ends-with(@id,'eoId')]/table/tbody/tr/td
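
For readers who want to check the ordering without an XSLT processor, here is a minimal Python sketch of the same idea: tag each cell with its position inside its own row, then stable-sort on that position. It uses only the standard library; the names `SAMPLE` and `cells_by_column` are made up for this illustration, and the sample wraps the question's rows in a bare <table> so that it parses.

```python
import xml.etree.ElementTree as ET

# Rows from the question, wrapped in a <table> so the snippet is well-formed.
SAMPLE = """
<table>
  <tr><td>1</td><td>2</td><td>3</td><td>4</td></tr>
  <tr><td>5</td><td>6</td><td>7</td><td>8</td></tr>
</table>
"""

def cells_by_column(table):
    """Return all <td> elements ordered by column position, then row order."""
    # Pair each cell with its 1-based position within its own row.
    # This matches count(preceding-sibling::*) + 1 when rows hold only <td>.
    keyed = [
        (position, td)
        for tr in table.findall("tr")
        for position, td in enumerate(tr.findall("td"), start=1)
    ]
    # list.sort() is stable, so cells in the same column keep their row order.
    keyed.sort(key=lambda pair: pair[0])
    return [td for _, td in keyed]

table = ET.fromstring(SAMPLE)
values = [td.text for td in cells_by_column(table)]
print(" ".join(values))  # 1 5 2 6 3 7 4 8
```

The integer sort key plays the role of a numeric `xsl:sort`; a text sort would place column 10 before column 2 in wider tables.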

Answer 2

Here is an XPath 2.0 expression that does this:

for $table in //*[ends-with(@id,'eoId')]/table,
    $col in 1 to count($table/tr[1]/td),
    $row in 1 to count($table/tr)
  return $table/tr[$row]/td[$col]/text()

Input:

<div id="a_eoId">
<table>
<tr>
  <td>1</td>
  <td>2</td>
  <td>3</td>
  <td>4</td>
</tr>
<tr>
  <td>5</td>
  <td>6</td>
  <td>7</td>
  <td>8</td>
</tr>
</table>
</div>

Result: the sequence

1 5 2 6 3 7 4 8

This XPath solution assumes that all rows have the same number of columns (or at least that no row has more columns than the first row).
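
To make that assumption concrete, the following Python sketch imitates the nested iteration (tables, then columns, then rows) using only the standard library. It is an approximation, not a port of the XPath: `column_major` and `SAMPLE` are hypothetical names, and because ElementTree's XPath subset has no ends-with(), the id filter is done in plain Python.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<div id="a_eoId">
<table>
<tr><td>1</td><td>2</td><td>3</td><td>4</td></tr>
<tr><td>5</td><td>6</td><td>7</td><td>8</td></tr>
</table>
</div>
"""

def column_major(root):
    """Walk tables, then columns, then rows, collecting cell text."""
    result = []
    # Stand-in for //*[ends-with(@id,'eoId')]: filter the ids in Python.
    holders = [el for el in root.iter() if el.get("id", "").endswith("eoId")]
    for holder in holders:
        for table in holder.findall("table"):
            rows = [tr.findall("td") for tr in table.findall("tr")]
            if not rows:
                continue
            # The column count is taken from the first row only.
            for col in range(len(rows[0])):
                for cells in rows:
                    # A shorter row contributes nothing for this column,
                    # much as td[$col] selects nothing in the XPath.
                    if col < len(cells) and cells[col].text is not None:
                        result.append(cells[col].text)
    return result

flat = column_major(ET.fromstring(SAMPLE))
print(flat)
```

Because the column count comes from the first row, a later row with more cells loses its extra cells, while a shorter row is simply skipped for the columns it lacks.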

Xpath - extract column after column

xml

html-table

extract