Python: 'NoneType' object has no attribute 'text', XML Parsing

Question

I am trying to take an XML file that uses Spirit, parse the data, and output it as a CSV file. I think I am overlooking something simple. I get this error:

Traceback (most recent call last):
  File "xml2csv.py", line 12, in <module>
    name = i.find("spirit:name").text
AttributeError: 'NoneType' object has no attribute 'text'

Sample XML file:

<?xml version="1.0" encoding="utf-8"?>
<spirit:component xmlns="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:spirit="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5"
  xsi:schemaLocation="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5 http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5/memoryMap.xsd">
>
<spirit:generic>
<spirit:name>GENERIC_NAME</spirit:name>
<spirit:description>GENERIC_DESCRIPTION</spirit:description>
</spirit:generic>
</spirit:component>

My Python code:

# Importing the required libraries
import xml.etree.ElementTree as Xet
import pandas as pd
  
cols = ["name", "description"]
rows = []
  
# Parsing the XML file
xmlparse = Xet.parse('xml_sample.xml')
root = xmlparse.getroot()
for i in root:
    name = i.find("spirit:name").text
    description = i.find("spirit:description").text
  
    rows.append({"spirit:name": name,
                 "spirit:description": description})
  
df = pd.DataFrame(rows, columns=cols)
  
# Writing dataframe to csv
df.to_csv('output.csv')

Based on some other threads I have been reading, I suspect my mistake is in the ".text". However, removing it results in my .csv file showing no data.

CSV:

,name,description
0,,

Any advice would be appreciated. I'm a bit stuck on this.

Answer 1

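The following seems to work. Note the namespace the code uses: {http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5}. ElementTree stores each tag as {namespace-URI}localname, so the bare path "spirit:name" matches nothing, find() returns None, and accessing .text on None raises the AttributeError.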

import xml.etree.ElementTree as ET
import pandas as pd

xml = '''<?xml version="1.0" encoding="utf-8"?>
<spirit:component xmlns="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:spirit="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5"
  xsi:schemaLocation="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5 http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5/memoryMap.xsd">
>
<spirit:generic>
<spirit:name>GENERIC_NAME</spirit:name>
<spirit:description>GENERIC_DESCRIPTION</spirit:description>
</spirit:generic>
</spirit:component>'''

cols = ["name", "description"]
rows = []
root = ET.fromstring(xml)
names = [x.text for x in root.findall('.//{http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5}name')]
descriptions = [x.text for x in root.findall('.//{http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5}description')]
for entry in zip(names, descriptions):
    rows.append({'name': entry[0], 'description': entry[1]})

df = pd.DataFrame(rows, columns=cols)
print(df)

Output

           name          description
0  GENERIC_NAME  GENERIC_DESCRIPTION
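
As an alternative sketch (not part of the answer above), ElementTree's find() and findtext() accept a namespaces mapping, which lets the loop from the question keep its "spirit:" prefixes. The NS dict, the extract_rows helper, and the trimmed inline XML are assumptions made for illustration. findtext() with default="" is used so that a missing child yields an empty string instead of raising AttributeError. The dict keys are "name" and "description" so that they match cols; the question's code used "spirit:name" and "spirit:description" as keys, which would likely leave those DataFrame columns empty even once the lookup works.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map (illustrative name). The prefix only has to match the one
# used in the search paths below; the URI must match the document's namespace.
NS = {"spirit": "http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5"}

# Trimmed version of the sample file, inlined so the sketch is self-contained.
XML = """<spirit:component xmlns:spirit="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.5">
<spirit:generic>
<spirit:name>GENERIC_NAME</spirit:name>
<spirit:description>GENERIC_DESCRIPTION</spirit:description>
</spirit:generic>
</spirit:component>"""


def extract_rows(xml_text):
    """Return one dict per child of the root, keyed by plain column names."""
    root = ET.fromstring(xml_text)
    rows = []
    for child in root:
        rows.append({
            # default="" avoids None when the child element is absent
            "name": child.findtext("spirit:name", default="", namespaces=NS),
            "description": child.findtext("spirit:description", default="", namespaces=NS),
        })
    return rows


rows = extract_rows(XML)
print(rows)
```

The resulting rows can then be passed to pd.DataFrame(rows, columns=cols) and written with df.to_csv('output.csv'), as in the question.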

python

xml

elementtree

pandas