Suppress namespace in ElementTree

Question

Given an XML file that looks like this:

<?xml version="1.0" encoding="windows-1252"?>
<Message xmlns="http://example.com/ns" xmlns:myns="urn:us:gov:dot:faa:aim:saa">
  <foo id="stuffid"/>
  <myns:bar/>
</Message>

When I parse it with ElementTree, the element tags look like:

{http://example.com/ns}Message
  {http://example.com/ns}foo
  {urn:us:gov:dot:faa:aim:saa}bar

But I would rather just have:

Message
  foo
  bar

More importantly, I would rather pass "Message", "foo", and "bar" to the find() and findall() methods.
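For reference, a minimal sketch that reproduces the behaviour described above. It assumes the sample document is held in a string (the XML declaration is left out for brevity), and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# The sample document from above, minus the XML declaration.
XML = """<Message xmlns="http://example.com/ns" xmlns:myns="urn:us:gov:dot:faa:aim:saa">
  <foo id="stuffid"/>
  <myns:bar/>
</Message>"""

root = ET.fromstring(XML)

# Tags come back in "{uri}local" form, in document order.
tags = [el.tag for el in root.iter()]
print(tags)

# An unqualified name matches nothing; the fully qualified name works.
plain_lookup = root.find("foo")
qualified_lookup = root.find("{http://example.com/ns}foo")
print(plain_lookup, qualified_lookup.get("id"))
```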

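I have tried stripping all the xmlns: attributes by substitution, as has been suggested (which is probably what I'll have to do if I can't find something more elegant), and I have tried calling ElementTree.register_namespace('', "http://example.com/ns"), but that only seems to help with ElementTree.tostring(), which isn't what I want.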
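A small sketch of what register_namespace() does and does not change, assuming the same sample document as a string. Note that register_namespace() updates a module-wide registry, so it affects every later serialization in the process.

```python
import xml.etree.ElementTree as ET

XML = """<Message xmlns="http://example.com/ns" xmlns:myns="urn:us:gov:dot:faa:aim:saa">
  <foo id="stuffid"/>
  <myns:bar/>
</Message>"""

# Register the URI as the default (prefix-less) namespace for serialization.
ET.register_namespace("", "http://example.com/ns")

root = ET.fromstring(XML)

# Serialization now writes <Message ...> instead of <ns0:Message ...>.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)

# The in-memory tag is unchanged, so unqualified lookups still fail.
print(root.tag)
print(root.find("foo"))
```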

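Isn't there a way to make ElementTree pretend it has never heard of xmlns?

Let's assume that my element tags are globally unique even without the namespace qualifiers. In that case, the namespaces just get in the way.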

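Addressing some of the comments in detail:

Joe linked to Python ElementTree module: How to ignore the namespace of XML files to locate matching element when using the method "find", "findall", which is close enough to my question that I guess mine is a duplicate. However, that question wasn't answered either. The suggestions given there were:

Use tree.findall("xmlns:DEAL_LEVEL/xmlns:PAID_OFF", namespaces={'xmlns': 'http://www.test.com'}).
- I can't find documentation for that call with those arguments at https://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.findall, and in any case it requires me to know all the namespaces.
Pre-process the input XML and strip the xmlns attributes from the input, as mentioned above.
Post-process the parsed document and strip all namespaces from the tags.
- Frankly, I like this approach best. I'll post the code as an answer.
Use register_namespace("", "http://example.com/ns").
- This suppresses the namespace in the output of ElementTree.tostring(el), but not in el.tag. I expect it doesn't help with find() or findall() either.
- Again, this doesn't solve the problem that I need to know all the namespaces ahead of time (or somehow extract them from the document).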
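For what it's worth, the namespaces can be pulled out of the document itself, which makes the first suggestion workable without knowing them in advance. The sketch below is one possible approach, not something proposed in the linked question: it collects prefix/URI pairs from iterparse() "start-ns" events and passes them to find(). The key "default" for the prefix-less namespace is an arbitrary name chosen here. The last lookup uses the "{*}" wildcard, which the Python documentation lists as added in version 3.8.

```python
import io
import xml.etree.ElementTree as ET

XML = """<Message xmlns="http://example.com/ns" xmlns:myns="urn:us:gov:dot:faa:aim:saa">
  <foo id="stuffid"/>
  <myns:bar/>
</Message>"""

# Collect every prefix -> URI declaration in the document.
# The default namespace arrives with an empty prefix; store it under
# the made-up key "default" so it can be used in path expressions.
ns_map = {}
source = io.BytesIO(XML.encode("utf-8"))
for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
    ns_map[prefix or "default"] = uri

root = ET.fromstring(XML)

# Prefixed lookups, resolved through the extracted mapping.
foo = root.find("default:foo", ns_map)
bar = root.find("myns:bar", ns_map)

# Python 3.8+ only: match the local name in any namespace.
any_bar = root.find("{*}bar")

print(ns_map)
print(foo.get("id"), bar.tag, any_bar.tag)
```

This still leaves the qualified names in el.tag, so it helps with find() and findall() but not with the tag display.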

Answer 1

OK, thanks for the link to the other question. I decided to borrow (and improve on) one of the solutions given there:

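def stripNs(el):
  '''Recursively walk this element tree, removing namespaces from tags and attribute names.'''
  if el.tag.startswith("{"):
    el.tag = el.tag.split('}', 1)[1]  # strip namespace
  # Iterate over a copy of the keys, because the dict is modified inside the loop
  # (on Python 3, looping over el.attrib.keys() directly raises RuntimeError).
  for k in list(el.attrib.keys()):
    if k.startswith("{"):
      k2 = k.split('}', 1)[1]  # attribute name without its namespace
      el.attrib[k2] = el.attrib[k]
      del el.attrib[k]
  for child in el:
    stripNs(child)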
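A quick check of this approach against the sample document. The function is restated here (as strip_ns, a renamed copy) only so the snippet runs on its own, and the myns:kind attribute is a made-up addition to exercise the attribute branch. As the question assumes, this is only safe when local names do not collide across namespaces.

```python
import xml.etree.ElementTree as ET

# Sample document plus one hypothetical namespaced attribute (myns:kind).
XML = """<Message xmlns="http://example.com/ns" xmlns:myns="urn:us:gov:dot:faa:aim:saa">
  <foo id="stuffid"/>
  <myns:bar myns:kind="demo"/>
</Message>"""


def strip_ns(el):
    """Recursively remove namespaces from tags and attribute names, in place."""
    if el.tag.startswith("{"):
        el.tag = el.tag.split("}", 1)[1]
    # Copy the keys first, since attributes are renamed inside the loop.
    for key in list(el.attrib.keys()):
        if key.startswith("{"):
            el.attrib[key.split("}", 1)[1]] = el.attrib.pop(key)
    for child in el:
        strip_ns(child)


root = ET.fromstring(XML)
strip_ns(root)

# Plain names now work for both display and lookup.
tags = [el.tag for el in root.iter()]
foo = root.find("foo")
bar = root.find("bar")
print(tags, foo.get("id"), bar.attrib)
```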
