Getting an attribute of an element with its corresponding id

Question

Suppose I have this XML file:

<article-set xmlns:ns0="http://casfwcewf.xsd" format-version="5">
<article>
 <article id="11234">
     <source>
     <hostname>some hostname for 11234</hostname>
     </source>
     <feed>
         <type weight=0.32>RSS</type>
     </feed>
     <uri>some uri for 11234</uri>
 </article>
 <article id="63563">
     <source>
     <hostname>some hostname for 63563 </hostname>
     </source>
     <feed>
         <type weight=0.86>RSS</type>
     </feed>
     <uri>some uri  for 63563</uri>
  </article>
.
.
.
</article></article-set>

What I want is to print each article id together with the weight attribute of its RSS type element, for the whole document, like this:

id=11234 
weight= 0.32


id=63563 
weight= 0.86
.
.
.

I am using this code to do that:

from lxml import etree
tree = etree.parse("C:\Users\Me\Desktop\public.xml")


for article in tree.iter('article'):
    article_id = article.attrib.get('id')

    for weight in tree.xpath("//article[@id={}]/feed/type/@weight".format(article_id)):
        print(article_id,weight)

It doesn't work. Can someone help me with this?

Answer 1

One of these may work for you:

In this version, note the addition of = in the call to tree.xpath():

from lxml import etree
tree = etree.parse("news.xml")


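# Visit every <article> element and read its id attribute.
for article in tree.iter('article'):
    article_id = article.attrib.get('id')

    # Search the whole tree for the article with this id, then take its weight.
    for weight in tree.xpath("//article[@id={}]/feed/type/@weight".format(article_id)):
        print(article_id,weight)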
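
As a side note, formatting the id straight into the path only matches because the ids here are numeric. A minimal sketch of an alternative, assuming lxml's XPath variable support (keyword arguments passed to xpath()), is shown below. The inline SAMPLE data and the helper name weights_for are made up for illustration, and the weight values are quoted so that the sample parses.

```python
from lxml import etree

# Hypothetical inline sample standing in for the file from the question;
# the weight values are quoted so that the XML is well-formed.
SAMPLE = b"""<article-set format-version="5">
  <article id="11234"><feed><type weight="0.32">RSS</type></feed></article>
  <article id="63563"><feed><type weight="0.86">RSS</type></feed></article>
</article-set>"""

root = etree.fromstring(SAMPLE)


def weights_for(article_id):
    """Return the weight values of the article with the given id."""
    # $article_id is an XPath variable; lxml fills it in from the keyword
    # argument, so the id is compared as a string and needs no quoting.
    return root.xpath("//article[@id=$article_id]/feed/type/@weight",
                      article_id=article_id)


for article in root.iter("article"):
    article_id = article.get("id")
    for weight in weights_for(article_id):
        print(article_id, weight)
```

With this approach a non-numeric id would also be matched as a plain string.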

Here, note that I replaced tree.xpath() with article.xpath():

from lxml import etree
tree = etree.parse("news.xml")

for article in tree.iter('article'):
    article_id = article.attrib.get('id')

    # Relative path: search only inside the current article.
    for weight in article.xpath("./feed/type/@weight"):
        print(article_id,weight)
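
For anyone who wants to try the relative-path idea without a file on disk, here is a rough, self-contained sketch. The inline SAMPLE string, the function name article_weights, and the fallback to the standard library's xml.etree.ElementTree are my own assumptions, not part of the question. It also skips any article element that has no id, such as a wrapper element.

```python
try:
    from lxml import etree  # the library used in this thread
except ImportError:  # assumption: fall back to the standard library
    import xml.etree.ElementTree as etree

# Hypothetical sample data; the weight values are quoted so that it parses.
SAMPLE = """\
<article-set format-version="5">
  <article id="11234">
    <feed><type weight="0.32">RSS</type></feed>
  </article>
  <article id="63563">
    <feed><type weight="0.86">RSS</type></feed>
  </article>
</article-set>
"""


def article_weights(xml_text):
    """Return (id, weight) pairs for every article that has an id."""
    root = etree.fromstring(xml_text)
    pairs = []
    for article in root.iter("article"):
        article_id = article.get("id")
        if article_id is None:
            continue  # e.g. a wrapper <article> without an id
        # Search relative to this article, so the id never goes into the path.
        for type_el in article.findall("./feed/type"):
            weight = type_el.get("weight")
            if weight is not None:
                pairs.append((article_id, weight))
    return pairs


for article_id, weight in article_weights(SAMPLE):
    print("id={}\nweight={}\n".format(article_id, weight))
```

findall() with a simple relative path is used here instead of xpath() only so that the same code runs on both libraries.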

Answer 2

You can do it in two lines, if you really want to.

>>> from lxml import etree
>>> tree = etree.parse('public.xml')
>>> for item in tree.xpath('.//article[@id]//type[@weight]'):
...     item.xpath('../..')[0].attrib['id'], item.attrib['weight']
... 
('11234', '0.32')
('63563', '0.86')

An XML checker I used insisted on double quotes around the values of weight. etree choked on the XML until I put the first line in the file; I'm not sure why.
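
On the quoting point: XML requires attribute values to be quoted, so a conforming parser should reject weight=0.32 as written in the question. The sketch below is one way to check that. The helper name is_well_formed and the standard-library fallback are assumptions added for illustration.

```python
try:
    from lxml import etree
    ParseFailure = etree.XMLSyntaxError
except ImportError:  # assumption: fall back to the standard library
    import xml.etree.ElementTree as etree
    ParseFailure = etree.ParseError


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        etree.fromstring(xml_text)
    except ParseFailure:
        return False
    return True


# Quoted attribute value: accepted.
print(is_well_formed('<type weight="0.32">RSS</type>'))
# Unquoted attribute value, as in the question's file: rejected.
print(is_well_formed('<type weight=0.32>RSS</type>'))
```

Quoting every weight value in the file should let the code from either answer parse it.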

python

xml

lxml

attr