Apache PDFBox Removes Horizontal Lines When Converting to PNG

I have a PDF, and when I render it to PNG, the horizontal and vertical lines are removed. Here is the PDF and what it looks like: https://drive.google.com/file/d/1sAXwnaoZ-QJn1Kbpw85hhzV_X5zwgfkA/view?usp=sharing

Here is the PNG rendered from that PDF using PDFBox 2.0.13:

Why are these lines being removed, and how can I get them to render in the PNG?

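The problem (most likely) is that you do not have a Java ImageIO plugin for the JBIG2 image format installed. The missing lines and headings are actually JBIG2 images.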

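When I run the PDFBox PDF Debugger without such a plugin and open your PDF in it, the debugger does not show the missing parts either. After I add such a plugin to its classpath, it suddenly shows them.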
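As a quick way to check whether such a plugin is visible to your own application, you can ask ImageIO which readers it has registered. The following is only a minimal sketch: the class name `ImageIoPluginCheck` and the helper `hasReader` are made up for illustration, and it assumes the plugin registers itself under the format name "JBIG2".

```java
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox: reports whether ImageIO
// currently has a reader registered for a given format name.
final class ImageIoPluginCheck {

    static boolean hasReader(String formatName) {
        // Pick up plugins that were added to the classpath after ImageIO was initialized.
        ImageIO.scanForPlugins();
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Assumption: the JBIG2 plugin registers the format name "JBIG2".
        System.out.println("JBIG2 reader available: " + hasReader("JBIG2"));
    }
}
```

If this prints `false` when run with the same classpath as your rendering code, the JBIG2 images in the PDF will most likely be skipped during rendering.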

For more details on PDFBox dependencies, please read the PDFBox 2.0 Dependencies page. In particular:

JAI Image I/O

PDF supports embedded image files, however support for some formats require third party libraries which are distributed under terms incompatible with the Apache 2.0 license:

These libraries are optional and will be loaded if present on the classpath, otherwise support for these image formats will be disabled and a warning will be logged when an unsupported image is encountered.

Maven dependencies for these components can be found in parent/pom.xml. Change the scope of the components if needed. Please make sure that any third party licenses are suitable for your project.

To include the JBIG2 library the following part can be included in your project pom.xml:

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
    <version>3.0.0</version>
</dependency>