Parsing an XML child element back as a string

I am trying to parse a complex XML document, but XPath is not behaving the way I expected. Here is my sample XML:

<project>
    <samples>
        <sample>show my balance</sample>
        <sample>show me the <subsample value='USD'>money</subsample>today</sample>
    </samples>
</project>

Here is my Python code:

from lxml import etree

somenode="<project><samples><sample>show my balance</sample><sample>show me the <subsample value='USD'>money</subsample>today</sample></samples></project>"

somenode_etree = etree.fromstring(somenode)

for x in somenode_etree.iterfind(".//sample"):
    print (etree.tostring(x))

I get this output:

b'<sample>show my balance</sample><sample>show me the <subsample value="USD">money</subsample>today</sample></samples></project>'
b'<sample>show me the <subsample value="USD">money</subsample>today</sample></samples></project>'

What I expected:

show my balance
show me the <subsample value="USD">money</subsample>today

What am I doing wrong?

This XPath expression returns both the text nodes and the child elements, as expected:

result = somenode_etree.xpath(".//sample/text() | .//sample/*")
result
['show my balance', 'show me the ', <Element subsample at 0x7f0516cfa288>, 'today']
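
The strings in that result list are not plain `str` objects: lxml returns "smart strings" that remember where they came from. A minimal sketch of how this could be inspected, using the question's XML (the variable names `root` and `origins` are only illustrative):

```python
from lxml import etree

xml = ("<project><samples><sample>show my balance</sample>"
       "<sample>show me the <subsample value='USD'>money</subsample>today</sample>"
       "</samples></project>")
root = etree.fromstring(xml)

origins = []
for text in root.xpath(".//sample/text()"):
    # is_tail is True when the string is stored as the .tail of an element
    kind = "tail" if text.is_tail else "text"
    # getparent() returns the element that holds this string
    origins.append((str(text), kind, text.getparent().tag))

for row in origins:
    print(row)
```

Here 'today' should be reported as the tail of `subsample` rather than as text of `sample`, which is why serializing `subsample` on its own can drag 'today' along with it.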

Printing the found nodes, as the OP requested:

for x in somenode_etree.xpath(".//sample/text() | .//sample/*[node()]"):
    if isinstance(x, etree._Element):
        print(etree.tostring(x, method='xml', with_tail=False).decode('UTF-8'))
    else:
        print(x)

Result:

show my balance
show me the 
<subsample value="USD">money</subsample>
today

The with_tail=False argument prevents the element's tail text from being appended to the serialized element.

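If only the text inside the child element is needed, print x.text instead of serializing the element:

>>> for x in somenode_etree.xpath(".//sample/text() | .//sample/*"):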
...     if type(x) == etree._Element:
...         print(x.text)
...     else:
...         print(x)
... 
show my balance
show me the 
money
today