How to access the child elements of an element with minidom?
I am reading an XML/OWL file (generated by Protege) in a Jupyter notebook.
I can read the root element, but for its children I get an error or empty output.
from xml.dom.minidom import parse

DOMTree = parse("pressman.owl")
collection = DOMTree.documentElement
if collection.hasAttribute("shelf"):
    print("Root element : %s" % collection.getAttribute("owl:ObjectProperty"))

for objectprop in collection.getElementsByTagName("owl:ObjectProperty"):
    if objectprop.hasAttribute("rdf:about"):
        propertytext = objectprop.getAttribute("rdf:about")
        property = propertytext.split('#', 2)
        print("Property: %s" % property[1])
        type = objectprop.getElementsByTagName('rdf:resource')
        print("Type: %s" % type)
And the pressman.owl file (abridged):
<rdf:RDF xmlns="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6#"
xml:base="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:PressmanOntology="urn:absolute:PressmanOntology#"
xmlns:UniversityOntology="http://www.semanticweb.org/sraza/ontologies/2021/4/UniversityOntology#">
<owl:Ontology rdf:about="urn:absolute:PressmanOntology"/>
<!-- Object Properties -->
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
<rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
<rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDiagram">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
<rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
<rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
</owl:ObjectProperty>
<!-- more entries... -->
</rdf:RDF>
The output is:
Property: hasAdvice
Type: []
Property: hasDefinition
Type: []
Property: hasDiagram
Type: []
You have this structure:
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
<rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
<rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
</owl:ObjectProperty>
and you are using:
type = objectprop.getElementsByTagName('rdf:resource')
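This does not work, because rdf:resource is not an element but an attribute. I assume the one you are interested in belongs to <rdf:type>, so we need to go one level deeper: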
rdf_type = objectprop.getElementsByTagName('rdf:type')
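Now rdf_type is a node list (the method is called "get elements by tag name", after all), and minidom has no way of knowing that in your case there may be only one <rdf:type>. We take the first one, if it exists: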
rdf_type = rdf_type[0] if len(rdf_type) > 0 else None
Now rdf:resource is an attribute of that element. In minidom, attributes are accessed through .getAttribute().
In theory, the rdf:resource attribute could be missing from the XML, so make sure it exists before using it:
if rdf_type is not None and rdf_type.hasAttribute('rdf:resource'):
    rdf_resource = rdf_type.getAttribute('rdf:resource')
else:
    rdf_resource = None
print(rdf_resource)
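Putting the steps together, the whole loop might look like the sketch below. It parses a small inline document (an abridged copy of the file from the question) with parseString instead of reading pressman.owl, so it can be run as-is. The names SAMPLE and object_property_types are made up for this illustration and are not part of minidom.

```python
from xml.dom.minidom import parseString

# Abridged stand-in for pressman.owl (assumption: same structure as the real file).
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>
  <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
  </owl:ObjectProperty>
</rdf:RDF>"""


def object_property_types(xml_text):
    """Return (property name, rdf:resource of the first rdf:type or None) pairs."""
    root = parseString(xml_text).documentElement
    results = []
    for prop in root.getElementsByTagName("owl:ObjectProperty"):
        if not prop.hasAttribute("rdf:about"):
            continue
        # Keep only the part after '#', e.g. "hasAdvice".
        name = prop.getAttribute("rdf:about").split("#", 1)[-1]

        # rdf:type is an element, so it is looked up by tag name ...
        type_nodes = prop.getElementsByTagName("rdf:type")
        rdf_type = type_nodes[0] if len(type_nodes) > 0 else None

        # ... while rdf:resource is an attribute of that element.
        if rdf_type is not None and rdf_type.hasAttribute("rdf:resource"):
            rdf_resource = rdf_type.getAttribute("rdf:resource")
        else:
            rdf_resource = None
        results.append((name, rdf_resource))
    return results


for name, rdf_resource in object_property_types(SAMPLE):
    print("Property: %s" % name)
    print("Type: %s" % rdf_resource)
```

For hasAdvice, which has no <rdf:type> child, the type comes back as None rather than raising an error.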
All that said, rather than processing RDF files by hand, you may want to look at libraries written for RDF, such as rdflib, or even for OWL specifically, such as pyLODE.
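If you do stay with minidom, one thing to keep in mind is that lookups like 'rdf:type' match the literal prefix used in the file. A possible way around that, sketched below, is to use minidom's namespace-aware methods (getElementsByTagNameNS, hasAttributeNS, getAttributeNS) with the namespace URIs from the file header. The sample deliberately uses different prefixes (o: and r:) to show that the prefix no longer matters. SAMPLE_NS and property_types_ns are hypothetical names chosen for this example.

```python
from xml.dom.minidom import parseString

# Namespace URIs as declared in the header of the file from the question.
OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Same kind of data, but with non-standard prefixes (assumption for illustration).
SAMPLE_NS = """<r:RDF xmlns:o="http://www.w3.org/2002/07/owl#"
       xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <o:ObjectProperty r:about="urn:absolute:PressmanOntology#hasDiagram">
    <r:type r:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
  </o:ObjectProperty>
</r:RDF>"""


def property_types_ns(xml_text):
    """Like the prefix-based loop, but matching on namespace URI + local name."""
    root = parseString(xml_text).documentElement
    results = []
    for prop in root.getElementsByTagNameNS(OWL_NS, "ObjectProperty"):
        if not prop.hasAttributeNS(RDF_NS, "about"):
            continue
        name = prop.getAttributeNS(RDF_NS, "about").split("#", 1)[-1]

        type_nodes = prop.getElementsByTagNameNS(RDF_NS, "type")
        rdf_resource = None
        if len(type_nodes) > 0 and type_nodes[0].hasAttributeNS(RDF_NS, "resource"):
            rdf_resource = type_nodes[0].getAttributeNS(RDF_NS, "resource")
        results.append((name, rdf_resource))
    return results


print(property_types_ns(SAMPLE_NS))
```

This is still manual XML handling, so a dedicated RDF library remains the more robust choice for anything beyond simple extraction.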