Validating XML node structure with Python
I have this file:
<?xml version='1.0' encoding='UTF-8'?>
<AUTOSAR xmlns="http://autosar.org/schema/r4.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://autosar.org/schema/r4.0 AUTOSAR_4-2-2_STRICT_COMPACT.xsd">
<AR-PACKAGES>
<AR-PACKAGE>
<SHORT-NAME>RootP_Composition</SHORT-NAME>
<COMPOSITION-SW-COMPONENT-TYPE>
<SHORT-NAME>Compo_VSM</SHORT-NAME>
<CONNECTORS>
<ASSEMBLY-SW-CONNECTOR>
<SHORT-NAME>PP_CS_VehicleSPeed_ASWC_M6_to_ASWC_M740</SHORT-NAME>
<PROVIDER-IREF>
<CONTEXT-COMPONENT-REF DEST="SW-COMPONENT-PROTOTYPE">/RootP_Composition/Compo_VSM/Instance_ASWC_M6</CONTEXT-COMPONENT-REF>
<TARGET-P-PORT-REF DEST="P-PORT-PROTOTYOPE">/RootP_ASWC_M6/ASWC_M6/PP_CS_VehicleSPeed</TARGET-P-PORT-REF>
</PROVIDER-IREF>
<REQUESTER-IREF>
<CONTEXT-COMPONENT-REF DEST="SW-COMPONENT-PROTOTYPE">/RootP_Composition/Compo_VSM/Instance_ASWC_M740</CONTEXT-COMPONENT-REF>
<TARGET-R-PORT-REF DEST="R-PORT-PROTOTYOPE">/RootP_ASWC_M740/ASWC_M740/RP_CS_VehicleSPeed</TARGET-R-PORT-REF>
</REQUESTER-IREF>
</ASSEMBLY-SW-CONNECTOR>
</CONNECTORS>
</COMPOSITION-SW-COMPONENT-TYPE>
</AR-PACKAGE>
</AR-PACKAGES>
</AUTOSAR>
I want to check whether the ASSEMBLY-SW-CONNECTOR node has the children SHORT-NAME, PROVIDER-IREF and REQUESTER-IREF, and whether PROVIDER-IREF and REQUESTER-IREF have the children (grandchildren of ASSEMBLY-SW-CONNECTOR) CONTEXT-COMPONENT-REF and TARGET-P-PORT-REF, and CONTEXT-COMPONENT-REF and TARGET-R-PORT-REF, respectively.
So far I have this code:
tree = ET.parse('C:\test\Abu\TRS.ABU.GEN.002\output\Connectors.arxml')
root = tree.getroot()
child = ["SHORT-NAME", "PROVIDER-IREF", "REQUESTER-IREF"]
grandchild = ["CONTEXT-COMPONENT-REF", "TARGET-P-PORT-REF", "CONTEXT-COMPONENT-REF", "TARGET-R-PORT-REF"]
connector = '{http://autosar.org/schema/r4.0}ASSEMBLY-SW-CONNECTOR'
for element in root.iter(tag=connector):
    for child in element:
        for grandchild in child:
            if child.tag.split('}', 1)[1] in child:
                if grandchild.tag.split('}', 1)[1] in grandchild:
                    print("yes")
                else:
                    print("No")
Where am I going wrong? Thanks in advance!
Update 1
tree = etree.parse('C:\test\Abu\TRS.ABU.GEN.002\output\Connectors.arxml')
root = tree.getroot()
found_name = found_provider = found_requester = found_contextP = found_targetP = found_contextR = found_targetR = False
connectors = root.findall(".//{http://autosar.org/schema/r4.0}ASSEMBLY-SW-CONNECTOR>")
for elem in connectors:
    if elem.find(".//{http://autosar.org/schema/r4.0}SHORT-NAME>"):
        found_name = True
    if elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
        found_provider = True
        for child in elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
            if child.find(".//{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF>"):
                found_contextR = True
            if child.find(".//{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF>"):
                found_targetP = True
    if elem.find(".//{http://autosar.org/schema/r4.0}REQUESTER-IREF>"):
        found_requester = True
        for child in elem.find(".//{http://autosar.org/schema/r4.0}REQUESTER-IREF>"):
            if child.find(".//{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF>"):
                found_contextR = True
            if child.find(".//{http://autosar.org/schema/r4.0}TARGET-R-PORT-REF>"):
                found_targetR = True
if found_name and found_provider and found_requester and found_contextP and found_targetP and found_contextR and found_targetR:
    print("True")
else:
    print("False")
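Any idea why I get the wrong result?
Two issues:
First, some typos and small mistakes:
- All of your find paths end with an unnecessary closing bracket (>), so it needs to be removed from each of them.
- In your found_provider section you set found_contextR when I think you meant found_contextP (P, not R).
- Using if elem.find("<path>"): raises a warning, and an element that exists but has no children (such as SHORT-NAME) tests as false. You should use if elem.find("<path>") is not None: instead.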
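To illustrate the last point, here is a minimal sketch, assuming the standard-library xml.etree.ElementTree and a made-up two-element document (not the real ARXML file). An element without children has length 0, which is why truth-testing the result of find() is unreliable for leaf elements; comparing against None avoids that.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, used only for illustration.
root = ET.fromstring("<CONNECTOR><SHORT-NAME>demo</SHORT-NAME></CONNECTOR>")

leaf = root.find("SHORT-NAME")         # present, but has no children
missing = root.find("PROVIDER-IREF")   # absent, so find() returns None

leaf_found = leaf is not None          # the element is present
leaf_child_count = len(leaf)           # no children, so a truth test would say "not found"
missing_found = missing is not None    # the element really is absent

print(leaf_found, leaf_child_count, missing_found)
```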
Second, the part that handles the child elements is wrong. Take the found_provider section as an example:
if elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
    found_provider = True
    for child in elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF>"):
        if child.find(".//{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF>"):
            found_contextR = True
        if child.find(".//{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF>"):
            found_targetP = True
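You correctly find the PROVIDER-IREF element, and then you iterate over its children trying to match the CONTEXT-COMPONENT-REF and TARGET-P-PORT-REF elements. But you do this by searching for them as children of those child elements (i.e. as grandchildren of PROVIDER-IREF), when they are themselves the children.
So either you check the tag of each child element, rather than searching for elements below it: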
if elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF") is not None:
    found_provider = True
    for child in elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF"):
        if child.tag == "{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF":
            found_contextP = True
        if child.tag == "{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF":
            found_targetP = True
Or you could extract the PROVIDER-IREF element once and then look for the elements directly under it:
provider = elem.find(".//{http://autosar.org/schema/r4.0}PROVIDER-IREF")
if provider is not None:
    found_provider = True
    if provider.find("{http://autosar.org/schema/r4.0}CONTEXT-COMPONENT-REF") is not None:
        found_contextP = True
    if provider.find("{http://autosar.org/schema/r4.0}TARGET-P-PORT-REF") is not None:
        found_targetP = True
Obviously, you then do the same for the found_requester section.
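Putting the provider and requester checks together might look like the sketch below. It assumes the standard-library xml.etree.ElementTree; the names NS, check_connector, has and SAMPLE are made up for illustration, and SAMPLE is a cut-down stand-in for the real file. Unlike the flags in the question, which are shared across all connectors, this variant evaluates each connector on its own.

```python
import xml.etree.ElementTree as ET

NS = "{http://autosar.org/schema/r4.0}"  # namespace prefix used in every tag

def check_connector(elem):
    """Return True if one ASSEMBLY-SW-CONNECTOR has all expected children."""
    def has(parent, tag):
        # A path without ".//" only matches direct children of parent.
        return parent is not None and parent.find(NS + tag) is not None

    provider = elem.find(NS + "PROVIDER-IREF")
    requester = elem.find(NS + "REQUESTER-IREF")
    return all([
        has(elem, "SHORT-NAME"),
        provider is not None,
        requester is not None,
        has(provider, "CONTEXT-COMPONENT-REF"),
        has(provider, "TARGET-P-PORT-REF"),
        has(requester, "CONTEXT-COMPONENT-REF"),
        has(requester, "TARGET-R-PORT-REF"),
    ])

# Hypothetical, shortened sample document.
SAMPLE = """<AUTOSAR xmlns="http://autosar.org/schema/r4.0">
  <ASSEMBLY-SW-CONNECTOR>
    <SHORT-NAME>demo</SHORT-NAME>
    <PROVIDER-IREF>
      <CONTEXT-COMPONENT-REF>/a</CONTEXT-COMPONENT-REF>
      <TARGET-P-PORT-REF>/b</TARGET-P-PORT-REF>
    </PROVIDER-IREF>
    <REQUESTER-IREF>
      <CONTEXT-COMPONENT-REF>/c</CONTEXT-COMPONENT-REF>
      <TARGET-R-PORT-REF>/d</TARGET-R-PORT-REF>
    </REQUESTER-IREF>
  </ASSEMBLY-SW-CONNECTOR>
</AUTOSAR>"""

root = ET.fromstring(SAMPLE)
results = [check_connector(c) for c in root.iter(NS + "ASSEMBLY-SW-CONNECTOR")]
print(results)
```

For the real file you would replace ET.fromstring(SAMPLE) with ET.parse(path).getroot(); a raw string such as r'C:\...' keeps the backslashes in a Windows path from being read as escape sequences.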
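I think your original approach was actually a good one: specify a child-grandchild structure and then check whether the XML fits it. But you need to specify which grandchildren belong to which children, so you could use a nested dictionary like this: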
structure = {
    "ASSEMBLY-SW-CONNECTOR": {
        "SHORT-NAME": None,
        "PROVIDER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-P-PORT-REF": None
        },
        "REQUESTER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-R-PORT-REF": None
        }
    }
}
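Then have a recursive function (i.e. a function that calls itself) that searches for the matching children until it reaches None, at which point it stops looking further down that branch.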
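One possible shape for that recursive function is sketched below, assuming the standard-library xml.etree.ElementTree. The names NS, matches and SAMPLE are hypothetical, and SAMPLE is a shortened stand-in for the real file. The sketch only inspects the first child with each tag, so it would need extending if a connector could contain repeated elements.

```python
import xml.etree.ElementTree as ET

NS = "{http://autosar.org/schema/r4.0}"

# Expected layout: each key is a required child tag, None marks a leaf.
structure = {
    "ASSEMBLY-SW-CONNECTOR": {
        "SHORT-NAME": None,
        "PROVIDER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-P-PORT-REF": None,
        },
        "REQUESTER-IREF": {
            "CONTEXT-COMPONENT-REF": None,
            "TARGET-R-PORT-REF": None,
        },
    }
}

def matches(elem, expected):
    """Recursively check that elem has every child named in expected."""
    if expected is None:          # leaf: nothing further to verify
        return True
    for tag, sub in expected.items():
        child = elem.find(NS + tag)   # direct children only
        if child is None or not matches(child, sub):
            return False
    return True

# Hypothetical, shortened sample document.
SAMPLE = """<AUTOSAR xmlns="http://autosar.org/schema/r4.0">
  <ASSEMBLY-SW-CONNECTOR>
    <SHORT-NAME>demo</SHORT-NAME>
    <PROVIDER-IREF>
      <CONTEXT-COMPONENT-REF>/a</CONTEXT-COMPONENT-REF>
      <TARGET-P-PORT-REF>/b</TARGET-P-PORT-REF>
    </PROVIDER-IREF>
    <REQUESTER-IREF>
      <CONTEXT-COMPONENT-REF>/c</CONTEXT-COMPONENT-REF>
      <TARGET-R-PORT-REF>/d</TARGET-R-PORT-REF>
    </REQUESTER-IREF>
  </ASSEMBLY-SW-CONNECTOR>
</AUTOSAR>"""

root = ET.fromstring(SAMPLE)
expected = structure["ASSEMBLY-SW-CONNECTOR"]
results = [matches(c, expected) for c in root.iter(NS + "ASSEMBLY-SW-CONNECTOR")]
print(results)
```

Keeping the expected layout in data rather than in if statements means a new required element only needs a new dictionary entry.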