How to spot an XML node by the attributes of its sibling using XPath?
Suppose I have the following XML file (irrelevant parts are replaced with '...'):
<?xml version="1.0" encoding="ISO-8859-1"?>
<PARAMETERS version="1.6.2" xsi:noNamespaceSchemaLocation="http://open-ms.sourceforge.net/schemas/Param_1_6_2.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<NODE name="info" description="">
<ITEM name="version" value="2.0.0" type="string" description="" required="false" advanced="false" />
<ITEM name="num_vertices" value="5" type="int" description="" required="false" advanced="false" />
<ITEM name="num_edges" value="4" type="int" description="" required="false" advanced="false" />
<ITEM name="description" value="<![CDATA[]]>" type="string" description="" required="false" advanced="false" />
</NODE>
<NODE name="vertices" description="">
<NODE name="0" description="">
<ITEM name="recycle_output" value="false" type="string" description="" required="false" advanced="false" />
<ITEM name="toppas_type" value="input file list" type="string" description="" required="false" advanced="false" />
<ITEMLIST name="file_names" type="string" description="" required="false" advanced="false">
<LISTITEM value="input_data/STD_MIX_1_25_neg.mzML"/>
</ITEMLIST>
<ITEM name="x_pos" value="-1680" type="double" description="" required="false" advanced="false" />
<ITEM name="y_pos" value="-620" type="double" description="" required="false" advanced="false" />
</NODE>
<NODE name="1" description="">
...
</NODE>
...
</NODE>
</PARAMETERS>
My goal is an XPath query that returns the ITEMLIST node with the attribute name="file_names" that has a sibling ITEM node with the attributes name="toppas_type" and value="input file list". I tried the following:
'./NODE/NODE[ITEM[@name="toppas_type"][@value="input file list"]]/ITEMLIST[@name="file_names"]'
using xml.etree.ElementTree in Python 3.4, but I get the error 'invalid predicate'. I assume my query contains a silly mistake, but I can't find it.
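xml.etree.ElementTree has only limited XPath support. Nested predicates, such as the ITEM[...] test inside NODE[...] in your query, are not part of the subset it implements. From the documentation: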
This module provides limited support for XPath expressions for
locating elements in a tree. The goal is to support a small subset of
the abbreviated syntax; a full XPath engine is outside the scope of
the module.
If you can switch to lxml, this can be solved with the following-sibling axis:
//ITEM[@name = 'toppas_type' and @value = 'input file list']/following-sibling::ITEMLIST[@name = 'file_names']
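If switching to lxml is not an option, the same lookup can probably be done with xml.etree.ElementTree alone by moving the sibling test out of the XPath expression and into Python. The sketch below is only one way to do it. The sample document is a trimmed-down version of the file from the question, the function name find_input_file_lists is made up for illustration, and the value "tool" on the second NODE is a placeholder, since that part of the original file is elided. Unlike the following-sibling query, this version matches the ITEMLIST whether it comes before or after the ITEM.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the file from the question (assumption: only the
# elements relevant to the lookup are kept; the content of NODE "1" is a
# hypothetical placeholder because it is elided in the original).
SAMPLE = """\
<PARAMETERS version="1.6.2">
  <NODE name="info" description="">
    <ITEM name="version" value="2.0.0" type="string" />
  </NODE>
  <NODE name="vertices" description="">
    <NODE name="0" description="">
      <ITEM name="recycle_output" value="false" type="string" />
      <ITEM name="toppas_type" value="input file list" type="string" />
      <ITEMLIST name="file_names" type="string">
        <LISTITEM value="input_data/STD_MIX_1_25_neg.mzML"/>
      </ITEMLIST>
    </NODE>
    <NODE name="1" description="">
      <ITEM name="toppas_type" value="tool" type="string" />
    </NODE>
  </NODE>
</PARAMETERS>
"""


def find_input_file_lists(root):
    """Return ITEMLIST[@name='file_names'] elements whose parent NODE also
    has an ITEM with name='toppas_type' and value='input file list'."""
    matches = []
    for node in root.iter("NODE"):
        # Simple, non-nested predicate: supported by ElementTree.
        marker = node.find("ITEM[@name='toppas_type']")
        # The second attribute is checked in Python rather than in XPath.
        if marker is not None and marker.get("value") == "input file list":
            matches.extend(node.findall("ITEMLIST[@name='file_names']"))
    return matches


root = ET.fromstring(SAMPLE)
item_lists = find_input_file_lists(root)
file_names = [
    list_item.get("value")
    for item_list in item_lists
    for list_item in item_list.findall("LISTITEM")
]
print(file_names)
```

For a real file, ET.parse(path).getroot() would replace ET.fromstring(SAMPLE).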