How to parse XML and store it as a list (Python)
Sorry to ask again.
I want to convert an XML file to Excel using xml.etree.ElementTree.
Suppose my XML looks like this:
<ParameterCluster>
<Name>AAAAAA</Name>
<ParameterDefinitionList>
<ParameterDefinition>
<Name>LengthMin</Name>
<Type>UInt8</Type>
</ParameterDefinition>
<ParameterDefinition>
<Name>LengthMax</Name>
<Type>UInt8</Type>
</ParameterDefinition>
</ParameterDefinitionList>
<VariantImlementationList>
<VariantImlementation>
<MajorVariantList>
<MajorVariant>A_Basis</MajorVariant>
</MajorVariantList>
<MinorVariantList>
<ParameterValue>
<ValueList>
<Value>47</Value>
</ValueList>
<ValueList>
<Value>80</Value>
</ValueList>
</ParameterValue>
</MinorVariantList>
<MajorVariantList>
<MajorVariant>B_Basis</MajorVariant>
<MajorVariant>C_Basis</MajorVariant>
</MajorVariantList>
<MinorVariantList>
<ParameterValue>
<ValueList>
<Value>47</Value>
</ValueList>
<ValueList>
<Value>40</Value>
</ValueList>
</ParameterValue>
</MinorVariantList>
</VariantImlementation>
</VariantImlementationList>
</ParameterCluster>
In other words, I have three bases (A_Basis, B_Basis, C_Basis).
In A_Basis, LengthMin is 47 and LengthMax is 80.
In B_Basis and C_Basis, however, LengthMin is 47 and LengthMax is 40.
So I would like to get something like:
{'AAAAAA','LengthMin','UInt8','A_Basis',47}
{'AAAAAA','LengthMax','UInt8','A_Basis',80}
{'AAAAAA','LengthMin','UInt8','B_Basis',47}
{'AAAAAA','LengthMax','UInt8','B_Basis',40}
{'AAAAAA','LengthMin','UInt8','C_Basis',47}
{'AAAAAA','LengthMax','UInt8','C_Basis',40}
Then I can write it to an Excel file. Is it possible to get a list like that?
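To parse the XML you can use BeautifulSoup instead of xml.etree.ElementTree (its interface is more intuitive).
The parsing is straightforward, assuming each ParameterValue always contains as many ValueList entries as there are ParameterDefinition entries: first extract the parameter names and types, then iterate over every <MajorVariant> and fill the result list.
If using BeautifulSoup is not a problem, here is sample code: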
data = """<ParameterCluster>
<Name>AAAAAA</Name>
<ParameterDefinitionList>
<ParameterDefinition>
<Name>LengthMin</Name>
<Type>UInt8</Type>
</ParameterDefinition>
<ParameterDefinition>
<Name>LengthMax</Name>
<Type>UInt8</Type>
</ParameterDefinition>
</ParameterDefinitionList>
<VariantImlementationList>
<VariantImlementation>
<MajorVariantList>
<MajorVariant>A_Basis</MajorVariant>
</MajorVariantList>
<MinorVariantList>
<ParameterValue>
<ValueList>
<Value>47</Value>
</ValueList>
<ValueList>
<Value>80</Value>
</ValueList>
</ParameterValue>
</MinorVariantList>
<MajorVariantList>
<MajorVariant>B_Basis</MajorVariant>
<MajorVariant>C_Basis</MajorVariant>
</MajorVariantList>
<MinorVariantList>
<ParameterValue>
<ValueList>
<Value>47</Value>
</ValueList>
<ValueList>
<Value>40</Value>
</ValueList>
</ParameterValue>
</MinorVariantList>
</VariantImlementation>
</VariantImlementationList>
</ParameterCluster>"""
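from bs4 import BeautifulSoup
from pprint import pprint

# The 'xml' parser requires the lxml package to be installed.
soup = BeautifulSoup(data, 'xml')

# Collect [cluster name, parameter name, parameter type] for each definition.
name, types = soup.select_one('Name'), []
for n, t in zip(soup.select('ParameterDefinitionList Name'), soup.select('ParameterDefinitionList Type')):
    types.append([name.text, n.text, t.text])

# zip pairs each MajorVariantList with its MinorVariantList in document order.
rv = []
for major, minor in zip(soup.select('MajorVariantList'), soup.select('MajorVariantList ~ MinorVariantList')):
    for mj in major.select('MajorVariant'):
        # The i-th Value belongs to the i-th parameter definition.
        for i, mn in enumerate(minor.select('Value')):
            rv.append(types[i] + [mj.text, mn.text])

pprint(rv, width=120)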
Output:
[['AAAAAA', 'LengthMin', 'UInt8', 'A_Basis', '47'],
['AAAAAA', 'LengthMax', 'UInt8', 'A_Basis', '80'],
['AAAAAA', 'LengthMin', 'UInt8', 'B_Basis', '47'],
['AAAAAA', 'LengthMax', 'UInt8', 'B_Basis', '40'],
['AAAAAA', 'LengthMin', 'UInt8', 'C_Basis', '47'],
['AAAAAA', 'LengthMax', 'UInt8', 'C_Basis', '40']]
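Since the question asked specifically for xml.etree.ElementTree, the same traversal can probably be done with the standard library alone. The following is a minimal sketch, not the method used in the answer above: the function names parse_cluster and rows_to_csv, the CSV column headers, and the int() conversion of each Value are assumptions made for illustration, and it produces CSV text (which Excel can open) rather than a native .xlsx file. It relies on the same assumption as before, namely that each ParameterValue holds one Value per ParameterDefinition, in the same order.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Compact copy of the XML from the question (same structure and values).
SAMPLE = """<ParameterCluster>
  <Name>AAAAAA</Name>
  <ParameterDefinitionList>
    <ParameterDefinition><Name>LengthMin</Name><Type>UInt8</Type></ParameterDefinition>
    <ParameterDefinition><Name>LengthMax</Name><Type>UInt8</Type></ParameterDefinition>
  </ParameterDefinitionList>
  <VariantImlementationList>
    <VariantImlementation>
      <MajorVariantList><MajorVariant>A_Basis</MajorVariant></MajorVariantList>
      <MinorVariantList><ParameterValue>
        <ValueList><Value>47</Value></ValueList>
        <ValueList><Value>80</Value></ValueList>
      </ParameterValue></MinorVariantList>
      <MajorVariantList>
        <MajorVariant>B_Basis</MajorVariant>
        <MajorVariant>C_Basis</MajorVariant>
      </MajorVariantList>
      <MinorVariantList><ParameterValue>
        <ValueList><Value>47</Value></ValueList>
        <ValueList><Value>40</Value></ValueList>
      </ParameterValue></MinorVariantList>
    </VariantImlementation>
  </VariantImlementationList>
</ParameterCluster>"""


def parse_cluster(xml_text):
    """Return one [cluster, parameter, type, variant, value] row per combination."""
    root = ET.fromstring(xml_text)
    cluster = root.findtext("Name")  # direct child <Name> of the root element
    definitions = [
        (d.findtext("Name"), d.findtext("Type"))
        for d in root.iter("ParameterDefinition")
    ]
    rows = []
    for impl in root.iter("VariantImlementation"):  # tag spelled as in the source XML
        majors = impl.findall("MajorVariantList")
        minors = impl.findall("MinorVariantList")
        # Pair the n-th MajorVariantList with the n-th MinorVariantList.
        for major, minor in zip(majors, minors):
            # Assumption: every Value is an integer.
            values = [int(v.text) for v in minor.iter("Value")]
            for variant in major.findall("MajorVariant"):
                for (pname, ptype), value in zip(definitions, values):
                    rows.append([cluster, pname, ptype, variant.text, value])
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text; the header names are illustrative."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Cluster", "Parameter", "Type", "Variant", "Value"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(parse_cluster(SAMPLE)))
```

Saving the returned text to a file with a .csv extension should be enough for Excel to open it; for a native workbook, the same rows could instead be handed to a spreadsheet library.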