Image type UNKNOWN with PDFBox and JPEG 2000 sample

I took a sample JPEG 2000 image from the fnord examples page.

However, when I try to add that image to a PDF:

    // pageWidth, pageHeight and matrix are defined elsewhere
    PDDocument document = new PDDocument();
    PDImageXObject pdImage = PDImageXObject.createFromFileByContent(
            new File("samples/relax.jp2"), document);
    PDPage page = new PDPage(new PDRectangle(pageWidth, pageHeight));
    PDPageContentStream contentStream = new PDPageContentStream(document, page);
    contentStream.drawImage(pdImage, matrix);
    contentStream.close();

I get this exception:

    Caused by: java.lang.IllegalArgumentException: Image type UNKNOWN not supported: relax.jp2
        at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createFromFileByContent(PDImageXObject.java:313)

My PDFBox dependencies in Maven:

    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>2.0.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>fontbox</artifactId>
        <version>2.0.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>jempbox</artifactId>
        <version>1.8.16</version>
    </dependency>       
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>jbig2-imageio</artifactId>
        <version>3.0.2</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-core</artifactId>
        <version>1.4.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-jpeg2000</artifactId>
        <version>1.3.0</version>
    </dependency>

Am I doing something wrong? Or is there a problem with PDFBox and/or the sample I'm using?

Another Apache library, Tika, detects the MIME type of this sample file as image/jp2:

    TikaConfig tika = new TikaConfig();
    Metadata metadata = new Metadata();
    MediaType mimetype = tika.getDetector().detect(
            TikaInputStream.get(new FileInputStream("samples/relax.jp2")), metadata);

From the PDFBox API documentation:

createFromFileByContent()
The following file types are supported: jpg, jpeg, tif, tiff, gif, bmp and png.

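Looking at the source code, createFromFileByContent() internally relies on PDFBox's own check for known file types, which is independent of the underlying image libraries. The detection code is in FileTypeDetector.java.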

This check does not recognize JPEG 2000, which is why the image type comes back as UNKNOWN.
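To illustrate what a content-based check would have to look for, here is a minimal sketch of JPEG 2000 detection by magic bytes. This is not PDFBox's code; the class and method names are hypothetical. It assumes the two usual signatures: the 12-byte signature box of a JP2 container, and the SOC marker followed by the SIZ marker for a raw codestream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox.
class Jp2Signature {
    // JP2 container: 12-byte signature box at the start of the file.
    private static final byte[] JP2_BOX = {
        0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20, 0x0D, 0x0A, (byte) 0x87, 0x0A
    };
    // Raw JPEG 2000 codestream: SOC marker (FF 4F) followed by SIZ marker (FF 51).
    private static final byte[] J2K_CODESTREAM = {
        (byte) 0xFF, 0x4F, (byte) 0xFF, 0x51
    };

    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Returns true if the leading bytes look like a JP2 file or a raw codestream.
    static boolean isJpeg2000(byte[] header) {
        return startsWith(header, JP2_BOX) || startsWith(header, J2K_CODESTREAM);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java Jp2Signature <file>");
            return;
        }
        byte[] header = new byte[12];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.read(header);
        }
        // Only inspect the bytes that were actually read.
        byte[] seen = Arrays.copyOf(header, Math.max(n, 0));
        System.out.println("JPEG 2000: " + isJpeg2000(seen));
    }
}
```

Running it against samples/relax.jp2 would show whether the file carries one of these signatures, independently of what PDFBox reports.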

Actually, createFromFileByExtension() might be the better choice:

    if ("gif".equals(ext) || "bmp".equals(ext) || "png".equals(ext)) {
        BufferedImage bim = ImageIO.read(file);
        return LosslessFactory.createFromImage(doc, bim);
    }

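As long as you pretend the file is a GIF, BMP, or PNG (for example, by giving it one of those extensions) and your ImageIO installation supports JPEG 2000, this might sort of work (untested).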
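Whether ImageIO can read JPEG 2000 at all depends on which plugins are on the classpath. A minimal sketch for checking this at runtime follows; the class and method names are made up for illustration, and it assumes that a JPEG 2000 plugin, if present, registers itself under the jp2 file suffix.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical helper for inspecting the ImageIO plugin registry.
class ImageIoReaderCheck {
    // Returns true if at least one registered ImageReader claims the given file suffix.
    static boolean hasReaderForSuffix(String suffix) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersBySuffix(suffix);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        for (String suffix : new String[] {"png", "jp2"}) {
            System.out.println(suffix + " reader available: " + hasReaderForSuffix(suffix));
        }
    }
}
```

If no reader is reported for jp2, ImageIO.read(file) would be expected to return null for the sample, and the workaround above cannot succeed no matter which extension the file has.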