Parsing XML in Python and deleting containers

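I am trying to write a Python script that walks through a file and deletes the containers that hold a specific node value. For example, my tree looks like: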

<collection shelf="New Arrivals">
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
</collection>

Q1

If the value of the child node <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF"> equals /AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport, then the whole container should be deleted.

The script I wrote is:

import xml.etree.ElementTree as ET
tree = ET.parse('autosar1.xml')
root = tree.getroot()
for child in root.findall(".//ECUC-NUMERICAL-PARAM-VALUE"):
    for z in child.findall(".//DEFINITION-REF[@DEST='ECUC-BOOLEAN-PARAM-DEF']"):
        if z.text == "/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport":
            child.remove(z)         
tree.write('output.xml')

But I am not getting the expected result. The result I get is:

<collection shelf="New Arrivals">
<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>
</collection>

The result I want:

<collection shelf="New Arrivals">
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
</collection>

Q2

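Instead of hard-coding the node value in the if condition, is it possible to get the expected output by accepting user input (perhaps at the command prompt), say "ComIPduCancellationSupport" (not the whole value "/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport")?

Thanks a lot.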

Consider the third-party module lxml, a feature-rich and easy-to-use library for processing XML and HTML in Python. You can install it with pip or from a binary file for Windows. The reason for the recommendation is that the module can run fully W3C-compliant XPath 1.0 and XSLT 1.0, and XSLT in particular is useful to you here.

XSLT is a special-purpose language for transforming XML files, for example by removing nodes conditionally. Specifically, in XSLT we run the Identity Transform (to copy the entire document as is) and then apply an empty template to the node we intend to remove. Notice the use of contains(), which checks for the string anywhere in that node's text. This approach needs no for loops or if logic.

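With Python's lxml we can also build a dynamic XSLT script from a string (an XSLT script is itself an XML file) and pass a string such as CoolantTemp_T into contains(). Such a string can come from user input. Notice the string .format() call and the {0} placeholder.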
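Because the user's string is spliced straight into an XPath string literal, a stray quote character in the input would break the stylesheet (XPath 1.0 has no escape sequence inside string literals). The sketch below shows the placeholder substitution in isolation with a simple guard. `PATTERN_TEMPLATE`, `SAFE_NEEDLE`, and `build_pattern` are hypothetical names introduced here, the pattern is a shortened stand-in for the one in the stylesheet, and the set of allowed characters is an assumption about what a reference path can contain.

```python
import re

# Hypothetical, shortened stand-in for the match pattern used in the stylesheet.
PATTERN_TEMPLATE = "INTEGER-TYPE[contains(descendant::COMPU-METHOD-REF, '{0}')]"

# Assumed whitelist: letters, digits, underscore, dot, slash, and hyphen.
SAFE_NEEDLE = re.compile(r"[A-Za-z0-9_./-]+")


def build_pattern(needle):
    """Fill the {0} placeholder after rejecting input that could break the XPath literal."""
    if not SAFE_NEEDLE.fullmatch(needle):
        raise ValueError("unsupported characters in search string: %r" % needle)
    return PATTERN_TEMPLATE.format(needle)


# The value could equally come from input() or sys.argv[1].
pattern = build_pattern("CoolantTemp_T")
print(pattern)
```

One more thing to watch with str.format(): any literal curly braces in the stylesheet (for example XSLT attribute value templates) would have to be doubled as {{ and }} so that only {0} is treated as a placeholder.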

Python

import lxml.etree as et
doc = et.parse('Input.xml')

xsl_str='''<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                                         xmlns:doc="http://autosar.org/3.0.2">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- IDENTITY TRANSFORM -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- EMPTY TEMPLATE -->
  <xsl:template match="INTEGER-TYPE[descendant::COMPU-METHOD-REF/@DEST='COMPU-METHOD' and 
                                    contains(descendant::COMPU-METHOD-REF, '{0}')]">    
  </xsl:template>

</xsl:stylesheet>'''

# LOAD DYNAMIC XSL STRING (PASSING BELOW STRING INTO ABOVE)
xsl = et.fromstring(xsl_str.format('CoolantTemp_T'))

transform = et.XSLT(xsl)
result = transform(doc)

# OUTPUT TO SCREEN
print(result)    
# OUTPUT TO FILE
with open('output.xml', 'wb') as f:
    f.write(result)

Output

<?xml version="1.0"?>
<TOP-LEVEL-PACKAGES>
  <AR-PACKAGE>
    <SHORT-NAME>DataType</SHORT-NAME>
    <ELEMENTS>
      <INTEGER-TYPE>
        <SHORT-NAME>EngineSpeed_T</SHORT-NAME>
        <SW-DATA-DEF-PROPS>
          <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/EngineSpeed_T</COMPU-METHOD-REF>
        </SW-DATA-DEF-PROPS>
        <LOWER-LIMIT INTERVAL-TYPE="CLOSED">0</LOWER-LIMIT>
        <UPPER-LIMIT INTERVAL-TYPE="CLOSED">65535</UPPER-LIMIT>
      </INTEGER-TYPE>
      <INTEGER-TYPE>
        <SHORT-NAME>VehicleSpeed_T</SHORT-NAME>
        <SW-DATA-DEF-PROPS>
          <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/VehicleSpeed_T</COMPU-METHOD-REF>
        </SW-DATA-DEF-PROPS>
        <LOWER-LIMIT INTERVAL-TYPE="CLOSED">0</LOWER-LIMIT>
        <UPPER-LIMIT INTERVAL-TYPE="CLOSED">65535</UPPER-LIMIT>
      </INTEGER-TYPE>
      <INTEGER-TYPE>
        <SHORT-NAME>Percent_T</SHORT-NAME>
        <SW-DATA-DEF-PROPS>
          <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/Percent_T</COMPU-METHOD-REF>
        </SW-DATA-DEF-PROPS>
        <LOWER-LIMIT INTERVAL-TYPE="CLOSED">0</LOWER-LIMIT>
        <UPPER-LIMIT INTERVAL-TYPE="CLOSED">255</UPPER-LIMIT>
      </INTEGER-TYPE>
    </ELEMENTS>
    <SUB-PACKAGES>
      <AR-PACKAGE>
        <SHORT-NAME>DataTypeSemantics</SHORT-NAME>
        <ELEMENTS>
          <COMPU-METHOD>
            <SHORT-NAME>EngineSpeed_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/rpm</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>0</V>
                      <V>1</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>8</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
          <COMPU-METHOD>
            <SHORT-NAME>VehicleSpeed_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/kph</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>0</V>
                      <V>1</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>64</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
          <COMPU-METHOD>
            <SHORT-NAME>Percent_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/Percent</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>0</V>
                      <V>0.4</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>1</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
          <COMPU-METHOD>
            <SHORT-NAME>CoolantTemp_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/DegreeC</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>-40</V>
                      <V>1</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>2</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
        </ELEMENTS>
      </AR-PACKAGE>
      <AR-PACKAGE>
        <SHORT-NAME>DataTypeUnits</SHORT-NAME>
        <ELEMENTS>
          <UNIT>
            <SHORT-NAME>rpm</SHORT-NAME>
            <DISPLAY-NAME>rpm</DISPLAY-NAME>
          </UNIT>
          <UNIT>
            <SHORT-NAME>kph</SHORT-NAME>
            <DISPLAY-NAME>kph</DISPLAY-NAME>
          </UNIT>
          <UNIT>
            <SHORT-NAME>Percent</SHORT-NAME>
            <DISPLAY-NAME>Percent</DISPLAY-NAME>
          </UNIT>
          <UNIT>
            <SHORT-NAME>DegreeC</SHORT-NAME>
            <DISPLAY-NAME>DegreeC</DISPLAY-NAME>
          </UNIT>
        </ELEMENTS>
      </AR-PACKAGE>
    </SUB-PACKAGES>
  </AR-PACKAGE>
</TOP-LEVEL-PACKAGES>
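For comparison, the same kind of removal can be sketched with the standard library's xml.etree.ElementTree, using the element names from the question. In the question's script, child.remove(z) detaches only the DEFINITION-REF from its container; to delete the container itself, remove() has to be called on the container's own parent. The function name `remove_containers` and the shortened `SAMPLE` document are assumptions made for illustration, and the substring test with `in` plays the role that contains() plays in the XSLT.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the shape of the question's document (three containers instead of five).
SAMPLE = """<collection shelf="New Arrivals">
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
</collection>"""


def remove_containers(root, needle):
    """Remove each ECUC-NUMERICAL-PARAM-VALUE whose DEFINITION-REF text contains needle.

    Returns the number of containers removed.
    """
    removed = 0
    # Walk over possible parents: ElementTree elements do not know their parent,
    # and remove() must be called on the element that directly owns the container.
    for parent in list(root.iter()):
        for container in list(parent.findall("ECUC-NUMERICAL-PARAM-VALUE")):
            refs = container.findall(".//DEFINITION-REF[@DEST='ECUC-BOOLEAN-PARAM-DEF']")
            if any(needle in (ref.text or "") for ref in refs):
                parent.remove(container)
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
# Only the last path segment is needed; it could come from input() or sys.argv[1].
removed = remove_containers(root, "ComIPduCancellationSupport")
remaining = [ref.text for ref in root.iter("DEFINITION-REF")]
print(removed, remaining)
```

For a real file, the root would come from ET.parse('autosar1.xml').getroot() as in the question, and the tree would be saved with tree.write('output.xml') after the call. The snapshots made with list() keep the loops from skipping siblings while elements are being removed.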