Trying to check if a tag exists in XML before parsing

I need to check whether certain tags exist before parsing an XML file; I'm using ElementTree in Python. Based on what I read here, I tried writing this:


import urllib.request
import xml.etree.ElementTree as ET

tgz_xml = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC8300416"
response = urllib.request.urlopen(tgz_xml).read()
tree = ET.fromstring(response)


for OA in tree.findall('OA'):
  records = OA.find('records')
  if records is None:
    print('records missing')
  else:
    print('records found')

I need to check whether the "records" tag exists. I don't get an error, but nothing is printed. What am I doing wrong? Thanks!

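When this XML document is parsed, the variable tree already points to the OA element, which is the root. The expression tree.findall('OA') therefore looks for OA among the root's own children, returns an empty list, and the loop body never runs. Remove that line and the code will execute: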

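import xml.etree.ElementTree as ET
from urllib.request import urlopen

tgz_xml = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC8300416"

# Download the response body; the connection is closed on leaving the block
with urlopen(tgz_xml) as conn:
  response = conn.read()

# fromstring() returns the root element, which here is <OA>
tree = ET.fromstring(response)

# Search the root's direct children for <records>
records = tree.find('records')
if records is None:
  print('records missing')
else:
  print('records found')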
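
To see the same behaviour without a network call, here is a minimal sketch that parses an inline string. The SAMPLE document and its ids are made up for illustration and only assume that the root is OA with a records child, as described above; the real response may contain other elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with OA as the root; not real API output.
SAMPLE = """<OA>
  <request id="PMC0000000"/>
  <records>
    <record id="PMC0000000"/>
  </records>
</OA>"""

root = ET.fromstring(SAMPLE)

root_tag = root.tag                 # fromstring() hands back the root element itself
oa_children = root.findall('OA')    # only direct children are searched, so no match
records = root.find('records')      # <records> is a direct child of the root
nested = root.find('.//record')     # './/' searches descendants at any depth
missing = root.find('error')        # absent tag: find() returns None

print(root_tag, oa_children)
print(records is not None, nested is not None, missing is None)
```

Comparing against None, as the code above does, is the safer test: an element that exists but has no children has historically evaluated as false, so a plain `if records:` can misreport a tag as missing.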