How does PdfSmartCopy from iText detect identical resources?

pdf
itext
itextsharp

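Without digging into the PdfSmartCopy source code, I would like to know which "same" resources it can detect and reuse.

I know this is nearly impossible with subset fonts and with differing barcodes, and PdfSmartCopy does not detect those.

But what about images and tables: how does it check for the "same" resource?

Can anyone briefly describe which heuristic is used and which kinds of resources in the PDF are checked?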

There is a great answer directly on the iText website about this:

How is this possible? PdfSmartCopy takes a hash of every stream object that is encountered and keeps those hashes in memory. If PdfSmartCopy detects that you try to add the same stream twice, a reference to the first stream will be used instead of adding a redundant stream.
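To make the quoted description concrete, here is a minimal sketch of hash-based stream deduplication in plain Java. It is not iText's code: the class `StreamDeduplicator`, its `add` method, the integer object numbers, and the choice of SHA-256 are all assumptions made for illustration. It only shows the general idea of keeping a digest per stream in memory and handing back the earlier reference when the same bytes show up again.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the idea in the quote; not an iText class.
class StreamDeduplicator {
    // Maps a digest of the stream bytes to the object number first assigned to them.
    private final Map<String, Integer> seen = new HashMap<>();
    private int nextObjectNumber = 1;

    // Returns the object number that should be referenced for this stream content.
    int add(byte[] streamBytes) {
        String key = digest(streamBytes);
        Integer existing = seen.get(key);
        if (existing != null) {
            return existing; // identical bytes were stored before: reuse that reference
        }
        int assigned = nextObjectNumber++;
        seen.put(key, assigned);
        return assigned;
    }

    // SHA-256 is an assumption here; any stable digest of the bytes would do.
    private static String digest(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return Base64.getEncoder().encodeToString(md.digest(data));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under this model, adding the same image bytes twice yields the same object number, while a stream that differs by even one byte gets a new one. That would be consistent with the observation in the question that subset fonts and differing barcodes are not merged, since their stream bytes are not identical.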
