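XML to dict in Python with same tags that have multiple key-values
Many solutions have been proposed for XML-to-dict conversion, but none of them fit my specific use case.
My XML has many elements with the same tag, but each element can carry many key-value pairs, and not every element has the same number of them. That is what makes it challenging.
For example: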
<?xml version="1.0" encoding="UTF-8"?>
<mxfile host="xxx.xxx.com" modified="2021-06-14T07:52:04.437Z" agent="xxx" version="12.4.8" etag="o-cccc" type="device">
<diagram id="asdfsdf">
<mxGraphModel dx="1213" dy="2767" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="827" pageHeight="1169" math="0" shadow="0">
<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="2" value="label_1" style="points=[[0,0],[0.25,0],[0.5,0],[0.75,0],[1,0],[1,0.25],[1,0.5],[1,0.75],[1,1],[0.75,1],[0.5,1],[0.25,1],[0,1],[0,0.75],[0,0.5],[0,0.25]];outlineConnect=0;gradientColor=none;html=1;whiteSpace=wrap;fontSize=17;fontStyle=0;shape=shape_1;grIcon=icon_1;strokeColor=#232F3E;fillColor=none;verticalAlign=top;align=left;spacingLeft=30;fontColor=#232F3E;dashed=0;" vertex="1" parent="1">
<mxGeometry x="110" y="-50" width="1170" height="840" as="geometry"/>
</mxCell>
<mxCell id="3" value="Region" style="points=[[0,0],[0.25,0],[0.5,0],[0.75,0],[1,0],[1,0.25],[1,0.5],[1,0.75],[1,1],[0.75,1],[0.5,1],[0.25,1],[0,1],[0,0.75],[0,0.5],[0,0.25]];outlineConnect=0;gradientColor=none;html=1;whiteSpace=wrap;fontSize=17;fontStyle=0;shape=shape_1;grIcon=icon_2;strokeColor=#147EBA;fillColor=none;verticalAlign=top;align=left;spacingLeft=30;fontColor=#147EBA;dashed=0;" vertex="1" parent="1">
<mxGeometry x="290" y="190" width="960" height="580" as="geometry"/>
</mxCell>
<mxCell id="4" value="Area 1" style="fillColor=none;strokeColor=#147EBA;dashed=1;verticalAlign=top;fontStyle=0;fontColor=#147EBA;fontSize=17;" vertex="1" parent="1">
<mxGeometry x="750" y="340" width="320" height="420" as="geometry"/>
</mxCell>
<mxCell id="5" value="Area 1" style="fillColor=none;strokeColor=#147EBA;dashed=1;verticalAlign=top;fontStyle=0;fontColor=#147EBA;fontSize=17;" vertex="1" parent="1">
<mxGeometry x="326" y="340" width="364" height="420" as="geometry"/>
</mxCell>
<mxCell id="6" value="" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;fontSize=17;" edge="1" source="7" target="9" parent="1">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
<mxCell id="7" value="" style="outlineConnect=0;fontColor=#232F3E;gradientColor=none;fillColor=#232F3E;strokeColor=none;dashed=0;verticalLabelPosition=bottom;verticalAlign=top;align=center;html=1;fontSize=17;fontStyle=0;aspect=fixed;pointerEvents=1;shape=shape_3;" vertex="1" parent="1">
<mxGeometry x="698.43" y="-110" width="34" height="34" as="geometry"/>
</mxCell>
<mxCell id="8" value="" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;fontSize=17;" edge="1" source="9" target="35" parent="1">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
<mxCell id="9" value="" style="outlineConnect=0;fontColor=#232F3E;gradientColor=#945DF2;gradientDirection=north;fillColor=#5A30B5;strokeColor=#ffffff;dashed=0;verticalLabelPosition=bottom;verticalAlign=top;align=center;html=1;fontSize=17;fontStyle=0;aspect=fixed;shape=shape_1;resIcon=service_2;" vertex="1" parent="1">
<mxGeometry x="675.43" y="-30" width="80" height="80" as="geometry"/>
</mxCell>
<mxCell id="24" value="&lt;font style=&quot;font-size: 15px&quot;&gt;Service name 1&lt;/font&gt;" style="outlineConnect=0;fontColor=#232F3E;gradientColor=none;strokeColor=#ffffff;fillColor=#232F3E;dashed=0;verticalLabelPosition=middle;verticalAlign=bottom;align=center;html=1;whiteSpace=wrap;fontSize=17;fontStyle=1;spacing=3;shape=shape_2;prIcon=service_2;" vertex="1" parent="1">
<mxGeometry x="159" width="62" height="100" as="geometry"/>
</mxCell>
<mxCell id="25" value="" style="outlineConnect=0;fontColor=#232F3E;gradientColor=none;fillColor=#D86613;strokeColor=none;dashed=0;verticalLabelPosition=bottom;verticalAlign=top;align=center;html=1;fontSize=17;fontStyle=0;aspect=fixed;pointerEvents=1;shape=shape_3;" vertex="1" parent="1">
<mxGeometry x="817.1399999999999" y="383" width="64" height="64" as="geometry"/>
</mxCell>
<mxCell id="26" value="" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;fontSize=17;" edge="1" source="27" target="28" parent="1">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
<mxCell id="27" value="" style="outlineConnect=0;fontColor=#232F3E;gradientColor=#4D72F3;gradientDirection=north;fillColor=#3334B9;strokeColor=#ffffff;dashed=0;verticalLabelPosition=bottom;verticalAlign=top;align=center;html=1;fontSize=17;fontStyle=0;aspect=fixed;shape=shape_4;resIcon=service_3;" vertex="1" parent="1">
<mxGeometry x="473" y="640" width="64" height="64" as="geometry"/>
</mxCell>
<mxCell id="28" value="" style="outlineConnect=0;fontColor=#232F3E;gradientColor=#4D72F3;gradientDirection=north;fillColor=#3334B9;strokeColor=#ffffff;dashed=0;verticalLabelPosition=bottom;verticalAlign=top;align=center;html=1;fontSize=17;fontStyle=0;aspect=fixed;shape=shape_4;resIcon=service_2;" vertex="1" parent="1">
<mxGeometry x="885.2899999999998" y="639" width="64" height="64" as="geometry"/>
</mxCell>
<mxCell id="29" value="Primary&lt;br style=&quot;font-size: 17px;&quot;&gt;(Multi-area)" style="text;html=1;resizable=0;autosize=1;align=center;verticalAlign=middle;points=[];fillColor=none;strokeColor=none;rounded=0;fontSize=17;" vertex="1" parent="1">
<mxGeometry x="458" y="701" width="90" height="50" as="geometry"/>
</mxCell>
<mxCell id="30" value="" style="outlineConnect=0;fontColor=#232F3E;gradientColor=none;fillColor=#D86613;strokeColor=none;dashed=0;verticalLabelPosition=bottom;verticalAlign=top;align=center;html=1;fontSize=17;fontStyle=0;aspect=fixed;pointerEvents=1;shape=shape_3;" vertex="1" parent="1">
<mxGeometry x="385" y="380" width="68" height="68" as="geometry"/>
</mxCell>
<mxCell id="31" value="" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;endArrow=classic;endFill=1;fontSize=17;" edge="1" source="32" target="12" parent="1">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="503" y="550"/>
<mxPoint x="1160" y="550"/>
</Array>
</mxGeometry>
</mxCell>
<mxCell id="32" value="&lt;font style=&quot;font-size: 17px&quot;&gt;web component&lt;br style=&quot;font-size: 17px&quot;&gt;(&lt;b&gt;Service name 1&lt;br&gt;WordPress Instance&lt;/b&gt;)&lt;/font&gt;" style="text;html=1;resizable=0;autosize=1;align=center;verticalAlign=middle;points=[];fillColor=none;strokeColor=none;rounded=0;fontSize=17;" vertex="1" parent="1">
<mxGeometry x="413" y="452" width="180" height="70" as="geometry"/>
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>
Indicators:
Arrows - "edgeStyle=orthogonalEdgeStyle"
Objects (services) - "resIcon=service_2" / service_1, etc.
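Since both indicators live inside the ';'-separated style attribute, it may help to turn that string into a dict first. This is only a minimal sketch: parse_style is a hypothetical helper name, and storing bare tokens such as "text" with an empty value is my own assumption.

```python
def parse_style(style: str) -> dict:
    """Split a style string like 'k1=v1;k2=v2;flag;' into a dict."""
    result = {}
    for part in style.split(';'):
        if not part:
            continue  # skip the empty piece after the trailing ';'
        key, sep, value = part.partition('=')
        # Bare tokens such as 'text' have no '='; keep them with an empty value (assumption).
        result[key] = value if sep else ''
    return result


edge = parse_style("edgeStyle=orthogonalEdgeStyle;rounded=0;html=1;fontSize=17;")
service = parse_style("aspect=fixed;shape=shape_1;resIcon=service_2;")

print('edgeStyle' in edge)     # True, so this cell is an arrow
print(service.get('resIcon'))  # service_2
```

With the style as a dict, the indicator checks become plain key lookups instead of a chain of string comparisons.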
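What I have done so far:
Using xml.etree.ElementTree,
I loop over the tags and attributes and extract the values that match the keywords I am looking for.
The results are stored in arrays: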
id = []
attrb = []
objects_found = []
arrows_found = []
In the end I want to convert this into a dict object like:
{
    'id': '1',
    'attrb': 'service',
    'object': 'service_a',
    'arrows': True,
    'arrow_start': {coordinate},
    'arrow_end': {coordinate}
}
If there are no arrows:
{
    'id': '1',
    'attrb': 'service',
    'object': 'service_a'
}
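One possible way to fill the {coordinate} placeholders, assuming they mean the position of the cells an edge points to through its source and target attributes, is sketched below. The trimmed SAMPLE string and the names positions and edge_endpoints are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the same shape as the XML above (illustration only).
SAMPLE = """<root>
  <mxCell id="6" style="edgeStyle=orthogonalEdgeStyle;" edge="1" source="7" target="9" parent="1">
    <mxGeometry relative="1" as="geometry"/>
  </mxCell>
  <mxCell id="7" style="shape=shape_3;" vertex="1" parent="1">
    <mxGeometry x="698.43" y="-110" width="34" height="34" as="geometry"/>
  </mxCell>
  <mxCell id="9" style="shape=shape_1;resIcon=service_2;" vertex="1" parent="1">
    <mxGeometry x="675.43" y="-30" width="80" height="80" as="geometry"/>
  </mxCell>
</root>"""

root = ET.fromstring(SAMPLE)

# Map each cell id to the (x, y) of its mxGeometry child, when both values exist.
positions = {}
for cell in root.iter('mxCell'):
    geom = cell.find('mxGeometry')
    if geom is not None and geom.get('x') is not None and geom.get('y') is not None:
        positions[cell.get('id')] = (float(geom.get('x')), float(geom.get('y')))


def edge_endpoints(edge):
    """Return (start, end) positions for an edge cell; None where a cell is unknown."""
    return positions.get(edge.get('source')), positions.get(edge.get('target'))


edge = root.find("mxCell[@edge='1']")
start, end = edge_endpoints(edge)
print(start, end)  # (698.43, -110.0) (675.43, -30.0)
```

Using positions.get means an edge whose target id is not in the file (like target="35" in the full example) simply yields None instead of raising.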
My code:
import xml.etree.ElementTree as ET

tree = ET.parse('diagram.xml')  # hypothetical file name; use your own path

for item in tree.iter('mxCell'):
    # Named cell_id so it does not overwrite the `id` list declared earlier.
    cell_id = item.attrib['id']
    # The 'style' attribute is a long ';'-separated string; most of the info is in there.
    # Some cells (e.g. id="0") have no style at all, so default to an empty string.
    style_list = item.attrib.get('style', '').split(';')
    for style in style_list:
        if '=' in style:
            style_key, style_value = style.split('=', 1)
            if style_key == 'shape' and style_value != 'icon' and 'keyword-a' in style_value:
                service_icon = style_value
                id.append(cell_id)
                attrb.append("service_name")
                objects_found.append(service_icon)
            elif style_key == 'resIcon':
                service_icon = style_value
                id.append(cell_id)
                attrb.append("service_name")
                objects_found.append(service_icon)
            elif style_key == 'edgeStyle':
                arrow_style = style_value
                id.append(cell_id)
                attrb.append("arrows")
                arrows_found.append(arrow_style)
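What I have tried:
- dict(zip(...)). The problem is that some optional keys do not exist for some ids.
- A pandas DataFrame (not ideal, since I want a dict), which I tried in order to get a tabular form via CSV. The arrays do not work here either: once the values are put into separate arrays, the relationship between the ids and their key-values is lost, and arrays of different lengths cannot be combined into one DataFrame.
Any good suggestions for a solution?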
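I finally found a simple way to do it.
Keeping my current logic, I extract the data by keyword, and each key-value pair I identify is added to a nested dictionary.
That is, start with
dictObj = {}
and, inside the for loop, create a nested dictionary for each id: dictObj[id] = {}
Then, as each key is identified, add it with dictObj[id].update({'key': value})
I am not sure this is the most efficient approach, but at least it gives me the output I want. If anyone has a better way, please share.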
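Put together, the nested-dictionary idea might look roughly like the sketch below. It is not my exact code: build_cells and SAMPLE are hypothetical names, and for arrows it stores the source and target cell ids rather than coordinates, which is a simplification.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the same shape as the XML in the question (illustration only).
SAMPLE = """<root>
  <mxCell id="0"/>
  <mxCell id="6" style="edgeStyle=orthogonalEdgeStyle;rounded=0;" edge="1" source="7" target="9" parent="1">
    <mxGeometry relative="1" as="geometry"/>
  </mxCell>
  <mxCell id="7" style="aspect=fixed;shape=shape_3;" vertex="1" parent="1">
    <mxGeometry x="698.43" y="-110" width="34" height="34" as="geometry"/>
  </mxCell>
  <mxCell id="9" style="shape=shape_1;resIcon=service_2;" vertex="1" parent="1">
    <mxGeometry x="675.43" y="-30" width="80" height="80" as="geometry"/>
  </mxCell>
</root>"""


def build_cells(xml_text):
    """Return {cell_id: {...}} holding only the keys that were actually found."""
    root = ET.fromstring(xml_text)
    cells = {}
    for cell in root.iter('mxCell'):
        # Turn 'k1=v1;k2=v2;' into a dict; pieces without '=' are ignored here.
        style = dict(
            part.split('=', 1)
            for part in cell.get('style', '').split(';')
            if '=' in part
        )
        entry = {}
        if 'resIcon' in style:
            entry.update({'attrb': 'service', 'object': style['resIcon']})
        if 'edgeStyle' in style:
            entry.update({
                'attrb': 'arrows',
                'arrows': True,
                # Assumption: keep the connected cell ids instead of coordinates.
                'arrow_start': cell.get('source'),
                'arrow_end': cell.get('target'),
            })
        if entry:  # cells with no recognised indicator are skipped
            cells[cell.get('id')] = entry
    return cells


cells = build_cells(SAMPLE)
print(cells)
```

Because every id owns its own dict, optional keys can simply be absent, so the mismatched array lengths from the question never come up.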