How to prevent "billion laughs" DoS attack in Python's xlrd?

Question

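It seems that the Billion Laughs DoS attack can be prevented simply by stopping entities in XML files from being expanded. Is there a way to do this in Python's xlrd library (i.e. some kind of flag)? If not, is there a recommended way to avoid the attack?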

Answer 1

Not with xlrd alone

At this time xlrd has no option to prevent any kind of XML bomb. In the source code, the xlsx data is passed to Python's built-in xml.etree for parsing without any validation:

import xml.etree.ElementTree as ET

def process_stream(self, stream, heading=None):
    if self.verbosity >= 2 and heading is not None:
        fprintf(self.logfile, "\n=== %s ===\n", heading)
    self.tree = ET.parse(stream)

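However, it is possible with defusedxml

Patching ElementTree

As noted in the comments, defusedxml is a package aimed directly at the security problems posed by different kinds of XML bombs. From the docs: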

Instead of:

from xml.etree.ElementTree import parse
et = parse(xmlfile)

alter code to:

from defusedxml.ElementTree import parse
et = parse(xmlfile)
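
To give a rough idea of what "refusing entities" means in practice, here is a minimal stdlib-only sketch. It is not defusedxml's actual code; the names assert_no_entities and EntityDeclarationError are made up for this illustration. It assumes you have the raw XML bytes in hand and aborts as soon as the parser sees an entity declaration, so nothing gets expanded.

```python
import xml.parsers.expat


class EntityDeclarationError(ValueError):
    """Hypothetical exception: the document declares an XML entity."""


def assert_no_entities(xml_bytes):
    """Parse xml_bytes with expat and raise if any <!ENTITY ...> is declared.

    Illustrative helper only, not part of xlrd or defusedxml.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Raising here stops parsing before the entity can be referenced.
        raise EntityDeclarationError(
            "entity declaration not allowed: %r" % name)

    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
```

A plain document such as b"<root><a>1</a></root>" passes through silently, while anything with a DOCTYPE that declares entities is rejected at the first declaration.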

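defusedxml also provides a way to patch the standard library. Since the standard library parser is what xlrd uses, you can combine xlrd and defusedxml to read Excel files while protecting yourself from XML bombs.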

Additionally the package has an untested function to monkey patch all stdlib modules with defusedxml.defuse_stdlib().
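
Putting the two together might look like the sketch below. It assumes both defusedxml and xlrd are installed and that your xlrd version still reads .xlsx files; open_workbook_defused is just a name I picked for the wrapper. Keep in mind that the docs themselves describe defuse_stdlib() as untested, so try it against a known-bad file before relying on it.

```python
def open_workbook_defused(path):
    """Open an Excel file with xlrd after patching the stdlib XML modules.

    Hypothetical wrapper for illustration; requires the third-party
    packages defusedxml and xlrd.
    """
    import defusedxml
    import xlrd

    # Monkey patch the standard library XML modules so that the
    # ElementTree parse call made inside xlrd goes through defusedxml.
    defusedxml.defuse_stdlib()

    return xlrd.open_workbook(path)
```

With the patch applied, a workbook whose XML declares entities should raise an exception from defusedxml instead of being expanded.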


