Parsing nested XML via Python gives an empty list instead of tag values

I want to get all the Id tag values from the following XML (a SOAP API response):

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
   <infasoapns:Body xmlns:eAPI="http://api.ppdi.com/1.1/Site" xmlns:infasoapns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:infawsdlns="http://schemas.xmlsoap.org/wsdl/">
      <eAPI:getSiteResponse>
         <eAPI:SITE>
            <eAPI:Id>CTMSR_1-1036KJ</eAPI:Id>
            <eAPI:Sponsor>Ell Inc</eAPI:Sponsor>
            <eAPI:CRO>PDP</eAPI:CRO>
            <eAPI:Protocol_Number>EL184-308</eAPI:Protocol_Number>
            <eAPI:Protocol_Id>CTMSR_1-LCXB0</eAPI:Protocol_Id>
        </eAPI:SITE>
        <eAPI:SITE>
            <eAPI:Id>CTMSR_1-1036SM</eAPI:Id>
            <eAPI:Sponsor>Ell Inc</eAPI:Sponsor>
            <eAPI:CRO>PDP</eAPI:CRO>
            <eAPI:Protocol_Number>EL184-308</eAPI:Protocol_Number>
            <eAPI:Protocol_Id>CTMSR_1-LCXB0</eAPI:Protocol_Id>
        </eAPI:SITE>
        <eAPI:SITE>
            <eAPI:Id>CTMSR_1-1036SM</eAPI:Id>
            <eAPI:Sponsor>Ell Inc</eAPI:Sponsor>
            <eAPI:CRO>PDP</eAPI:CRO>
            <eAPI:Protocol_Number>EL184-308</eAPI:Protocol_Number>
            <eAPI:Protocol_Id>CTMSR_1-LCXB0</eAPI:Protocol_Id>
        </eAPI:SITE>
      </eAPI:getSiteResponse>
   </infasoapns:Body>
</soapenv:Envelope>

The code I wrote is below; it returns an empty list:

  1. When I run tree.findall('.//Id'), the output is: []
  2. When I run print(tree.find('Id')), the output is: None
  3. When I run tree.find('Id').text, the output is:

AttributeError Traceback (most recent call last)
----> 1 tree.find('Id').text

AttributeError: 'NoneType' object has no attribute 'text'

Code:

>>> import xml.etree.cElementTree as ElementTree
>>> file_path = r'C:\Users\dshukla\Desktop\docs\PPD project\Response\WS_SITES_1.1_RES'
>>> tree = ElementTree.parse(file_path)
>>> root = tree.getroot()
>>> print(root)
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope' at 0x00000238F6BFCD10>
>>> tree.findall('.//Id')
[]
>>> tree.find('Id')
>>> print(tree.find('Id'))
None
>>> tree.find('Id').text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'text'
>>>                                                           

**Why am I getting an empty list and a None-type error? How do I get the values of the Id tags from this XML file?**

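You need to pass the namespaces to findall(). Every element in your file belongs to a namespace, so ElementTree stores the tag as {http://api.ppdi.com/1.1/Site}Id, and a search for the bare name Id matches nothing. That is why findall() returns [] and find() returns None. Your file declares four namespace prefixes (soapenv and infasoapns map to the same URI):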

import xml.etree.ElementTree as ElementTree  # cElementTree was removed in Python 3.9
file_path = 'yourfile.xml'  # change to your file path

tree = ElementTree.parse(file_path)

root = tree.getroot()

namespaces = {"soapenv": "http://schemas.xmlsoap.org/soap/envelope/",
              "eAPI": "http://api.ppdi.com/1.1/Site",
              "infasoapns": "http://schemas.xmlsoap.org/soap/envelope/",
              "infawsdlns": "http://schemas.xmlsoap.org/wsdl/"}

names = root.findall('*/eAPI:getSiteResponse/eAPI:SITE/eAPI:Id', namespaces)  # pass namespaces like this

for name in names:
    print(name.text)

This is the result:

CTMSR_1-1036KJ
CTMSR_1-1036SM
CTMSR_1-1036SM
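
If it helps to see why the unqualified search fails, here is a minimal sketch. It assumes a trimmed-down copy of the response from the question, embedded as a string (two SITE entries, XML declaration dropped) so it runs without the file; the variable names are only illustrative. It prints the tag as ElementTree actually stores it, then shows two equivalent ways to match it, plus a guard against find() returning None.

```python
import xml.etree.ElementTree as ET

# Assumption: a shortened copy of the question's response, inlined for the demo.
XML = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <infasoapns:Body xmlns:eAPI="http://api.ppdi.com/1.1/Site"
                   xmlns:infasoapns="http://schemas.xmlsoap.org/soap/envelope/">
    <eAPI:getSiteResponse>
      <eAPI:SITE><eAPI:Id>CTMSR_1-1036KJ</eAPI:Id></eAPI:SITE>
      <eAPI:SITE><eAPI:Id>CTMSR_1-1036SM</eAPI:Id></eAPI:SITE>
    </eAPI:getSiteResponse>
  </infasoapns:Body>
</soapenv:Envelope>"""

root = ET.fromstring(XML)

# ElementTree stores every namespaced tag as "{namespace-uri}localname".
id_tags = sorted({el.tag for el in root.iter() if el.tag.endswith("}Id")})
print(id_tags)

# The bare name is a different tag, so nothing matches.
unqualified = root.findall(".//Id")
print(unqualified)

# Option 1: prefix plus a prefix-to-URI mapping. The prefix is only a local
# alias; the URI is what has to match the document.
ns = {"eAPI": "http://api.ppdi.com/1.1/Site"}
by_prefix = [el.text for el in root.findall(".//eAPI:Id", ns)]
print(by_prefix)

# Option 2: spell out the namespace URI in braces, no mapping needed.
by_uri = [el.text for el in root.findall(".//{http://api.ppdi.com/1.1/Site}Id")]
print(by_uri)

# find() returns None when nothing matches, so check before reading .text.
first = root.find(".//eAPI:Id", ns)
first_id = first.text if first is not None else None
print(first_id)
```

Using ".//" instead of the full '*/eAPI:getSiteResponse/eAPI:SITE/eAPI:Id' path is a trade-off: it is shorter and survives changes in nesting, but it also matches any eAPI:Id elsewhere in the document, so the explicit path is the safer choice when the same tag name could appear at other levels.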