Concatenate duplicate named XML tags in Python/Pandas

Question

Is there a way to concatenate the text of repeated tags that share the same name?

Sample XML:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor>"Austria"</neighbor>
        <neighbor>"Switzerland"</neighbor>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor>"Malaysia"</neighbor>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor>"Costa Rica"</neighbor>
        <neighbor>"Colombia"</neighbor>
    </country>
</data>

This is what I have so far:

from xml.etree import ElementTree as ET

tree = ET.parse('sample.xml')
root = tree.getroot()

for acct_det in root.iter('neighbor'):
    print(acct_det.text)

What I want is to concatenate the strings from the neighbor tags, one line per country:

Austria Switzerland
Malaysia
Costa Rica Colombia

I have not been able to find a way to do this.

Answer 1

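from lxml import etree

tree = etree.parse('tmp.xml')

# concat() names neighbor[1] and neighbor[2] explicitly, so only the
# first two neighbors of each country are joined.
slist = tree.xpath('//country')
for d in slist:
    print(d.xpath('concat(./neighbor[1]/text(), " ", ./neighbor[2]/text())'))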

Result

"Austria" "Switzerland"
"Malaysia" 
"Costa Rica" "Colombia"
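
The concat() expression above is written for exactly two neighbors. As a minimal sketch of one way to handle any number of them, the text nodes can be selected with XPath and joined in Python instead. This assumes lxml is installed; the inline SAMPLE data and the join_neighbors helper are illustrative names, not part of the original answer.

```python
from lxml import etree  # third-party package, assumed to be installed

# Inline subset of the sample data so the sketch runs on its own.
SAMPLE = b"""<data>
    <country name="Liechtenstein">
        <neighbor>"Austria"</neighbor>
        <neighbor>"Switzerland"</neighbor>
    </country>
    <country name="Singapore">
        <neighbor>"Malaysia"</neighbor>
    </country>
</data>"""


def join_neighbors(xml_bytes):
    """Return one space-joined string of neighbor texts per country."""
    root = etree.fromstring(xml_bytes)
    joined = []
    for country in root.xpath('//country'):
        # text() yields every neighbor's text, however many there are.
        texts = country.xpath('./neighbor/text()')
        # Drop the literal double quotes stored in the sample data.
        joined.append(" ".join(t.strip('"') for t in texts))
    return joined


for line in join_neighbors(SAMPLE):
    print(line)
```

Joining in Python also avoids the trailing space that concat() leaves when a country has only one neighbor.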

Answer 2

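from xml.etree import ElementTree as ET

tree = ET.parse("sample.xml")
root = tree.getroot()

# Each child of the root is one country; join all of its neighbor texts.
for country in root:
    # strip('"') removes the literal quotes stored in the element text
    neighbors = " ".join(n.text.strip('"') for n in country.findall("neighbor"))
    print(neighbors)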
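
Since the question mentions Pandas, here is a hedged sketch that extends the same idea to build one record per country, using only the standard library. The inline SAMPLE string and the neighbors_by_country helper are assumptions made for illustration, and empty neighbor elements are treated as empty strings.

```python
import xml.etree.ElementTree as ET

# Inline copy of the relevant parts of the sample data.
SAMPLE = """<data>
    <country name="Liechtenstein">
        <neighbor>"Austria"</neighbor>
        <neighbor>"Switzerland"</neighbor>
    </country>
    <country name="Singapore">
        <neighbor>"Malaysia"</neighbor>
    </country>
    <country name="Panama">
        <neighbor>"Costa Rica"</neighbor>
        <neighbor>"Colombia"</neighbor>
    </country>
</data>"""


def neighbors_by_country(xml_text):
    """Return a list of {'country', 'neighbors'} dicts, one per country."""
    root = ET.fromstring(xml_text)
    rows = []
    for country in root.findall("country"):
        # n.text is None for an empty element, so fall back to "".
        names = [(n.text or "").strip().strip('"')
                 for n in country.findall("neighbor")]
        rows.append({"country": country.get("name"),
                     "neighbors": " ".join(names)})
    return rows


for row in neighbors_by_country(SAMPLE):
    print(row["country"], "->", row["neighbors"])
```

If pandas is available, passing the returned list to pandas.DataFrame(rows) gives a table with a country column and a neighbors column.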
