Remove parent node depending on the child element values

I have many XML files like the sample input file below.

I want to remove every <b> node whose child elements do not contain the value banana.

<a>
  <header>
    fruit
  </header>
  <b>
    <fruitlist>
      <d>banana</d>
    </fruitlist>
    <fruitlist>
      <d>apple</d>
    </fruitlist>
  </b>
  
  <b>
    <fruitlist>
      <d>lemon</d>
    </fruitlist>
    <fruitlist>
      <d>tomato</d>
    </fruitlist>
  </b>
  
  <b>
    <fruitlist>
      <d>banana</d>
    </fruitlist>
  </b>
  
  <b>
    <fruitlist>
      <d>lemon</d>
    </fruitlist>
    <fruitlist>
      <d>kiwi</d>
    </fruitlist>
  </b>
  
  <b>
    <fruitlist>
      <d>strawberry</d>
    </fruitlist>
  </b>
</a>

This is what I want:

<a>
  <header>
    fruit
  </header>
  <b>
    <fruitlist>
      <d>banana</d>
    </fruitlist>
    <fruitlist>
      <d>apple</d>
    </fruitlist>
  </b>
  <b>
    <fruitlist>
      <d>banana</d>
    </fruitlist>
  </b>
</a>

My code looks like this:

def removebanana(directories):
    xmlFiles = directories + "/*.xml"
    dirloc = directories + "/result"
    for fname in glob.glob(xmlFiles):
        name = os.path.basename(fname)
        content = open(fname, "rt", encoding="utf-8", errors="ignore")

        root = tree.getroot()
        for b in root.findall("b"):
            dlist = []
            for b.find("d") is not None:
                d = str(drug.find("d").text)
                dlist.append(d)

            for dd in dlist:
                dd = dd.strip()
                if dd.lower() == "banana":
                    cnt += 1
            if cnt == 0:
                root.remove(b)
                num += 0

        filename = f"{dirloc}/{name}"
        cnt += 1
        tree.write(filename)

However, the result is identical to the sample input file.

If I understand you correctly, this is what you need to do:

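import xml.etree.ElementTree as ET

fruits = """[your XML above]"""

# Parse the XML string; fromstring returns the root <a> element
root = ET.fromstring(fruits)

# findall returns a list, so removing elements while looping over it is safe
for target in root.findall('.//b'):
    # Collect the text of every <d> nested anywhere under this <b>
    f_list = [t.text for t in target.findall('.//d')]
    if "banana" not in f_list:
        # remove() only works on direct children; every <b> here sits directly under <a>
        root.remove(target)
print(ET.tostring(root).decode())

# To write to a file:
ET.ElementTree(root).write("test.xml", encoding="utf-8")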

Output:

<a>
  <header>
    fruit
  </header>
  <b>
    <fruitlist>
      <d>banana</d>
    </fruitlist>
    <fruitlist>
      <d>apple</d>
    </fruitlist>
  </b>
  
  <b>
    <fruitlist>
      <d>banana</d>
    </fruitlist>
  </b>  
  </a>
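
To apply the same idea to a whole folder, as the question's function tried to do, something like the following might work. It is only a sketch under a few assumptions: the function names `keep_matching_b` and `remove_banana` are made up for illustration, the output goes to a `result` subfolder as in the question, and the comparison strips whitespace and ignores case because the question's code did so. Note that `b.find("d")` in the question only looks at direct children of `<b>`, while each `<d>` sits one level deeper inside `<fruitlist>`, so the sketch uses `iter("d")` to search all descendants.

```python
import glob
import os
import xml.etree.ElementTree as ET


def keep_matching_b(root, wanted="banana"):
    """Remove each direct <b> child of root with no <d> equal to `wanted`.

    Returns the number of <b> elements removed.
    """
    removed = 0
    # findall returns a list, so removing from root inside the loop is safe
    for b in root.findall("b"):
        # iter("d") walks every descendant <d>, not just direct children
        values = [(d.text or "").strip().lower() for d in b.iter("d")]
        if wanted.lower() not in values:
            root.remove(b)
            removed += 1
    return removed


def remove_banana(directory):
    """Filter every *.xml file in `directory`; write results to directory/result."""
    out_dir = os.path.join(directory, "result")  # assumed output location
    os.makedirs(out_dir, exist_ok=True)
    for fname in glob.glob(os.path.join(directory, "*.xml")):
        tree = ET.parse(fname)
        keep_matching_b(tree.getroot())
        tree.write(os.path.join(out_dir, os.path.basename(fname)), encoding="utf-8")
```

Calling `remove_banana("path/to/xml_folder")` (a placeholder path) should leave the input files untouched and write the filtered copies next to them in `result`.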