PDFBox: joining two PDFs side by side while optimizing disk space

I am using PDFBox to join two PDFs side by side.
I am using the following code:

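import java.awt.geom.AffineTransform;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.multipdf.LayerUtility;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;

// sourceDoc and targetDoc are PDDocument instances loaded elsewhere
PDDocument outDoc = new PDDocument();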

int maxPages = targetDoc.getNumberOfPages();
if (sourceDoc.getNumberOfPages() > targetDoc.getNumberOfPages()) {
    maxPages = sourceDoc.getNumberOfPages();
}
PDPage sourceIndexPage;
PDPage targetIndexPage;
PDRectangle pdf1Frame;
PDRectangle pdf2Frame;
PDRectangle outPdfFrame;
COSDictionary dict;
PDPage outPdfPage;
LayerUtility layerUtility;
PDFormXObject sourceFormPDF;
PDFormXObject targetFormPDF;
AffineTransform afLeft;
AffineTransform afRight;

for (int indexPage = 0; indexPage < maxPages; indexPage++) {

    // Create output PDF frame
    try {
        sourceIndexPage = sourceDoc.getPage(indexPage);
    } catch (IndexOutOfBoundsException error) {
        sourceDoc.addPage(new PDPage());
        sourceIndexPage = sourceDoc.getPage(indexPage);
    }

    try {
        targetIndexPage = targetDoc.getPage(indexPage);
    } catch (IndexOutOfBoundsException error) {
        targetDoc.addPage(new PDPage());
        targetIndexPage = targetDoc.getPage(indexPage);
    }

    sourceIndexPage.setRotation(0);
    targetIndexPage.setRotation(0);

    pdf1Frame = sourceIndexPage.getCropBox();
    pdf2Frame = targetIndexPage.getCropBox();
    outPdfFrame = new PDRectangle(pdf1Frame.getWidth() + pdf2Frame.getWidth(),
            Math.max(pdf1Frame.getHeight(), pdf2Frame.getHeight()));

    // Create output page with calculated frame and add it to the document
    dict = new COSDictionary();
    dict.setItem(COSName.TYPE, COSName.PAGE);
    dict.setItem(COSName.MEDIA_BOX, outPdfFrame);
    dict.setItem(COSName.CROP_BOX, outPdfFrame);
    dict.setItem(COSName.ART_BOX, outPdfFrame);
    outPdfPage = new PDPage(dict);
    outDoc.addPage(outPdfPage);

    // Source PDF pages have to be imported as form XObjects so that they can be
    // placed at a specific position in the output page
    layerUtility = new LayerUtility(outDoc);
    sourceFormPDF = layerUtility.importPageAsForm(sourceDoc, indexPage);
    targetFormPDF = layerUtility.importPageAsForm(targetDoc, indexPage);

    // Add form objects to output page
    afLeft = new AffineTransform();
    layerUtility.appendFormAsLayer(outPdfPage, sourceFormPDF, afLeft, "left " + indexPage);
    afRight = AffineTransform.getTranslateInstance(pdf1Frame.getWidth(), 0.0);
    layerUtility.appendFormAsLayer(outPdfPage, targetFormPDF, afRight, "right" + indexPage);
}

outDoc.save("outDoc.pdf");

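The problem is that for some documents the resulting outDoc is far too large. I expected its size to be roughly that of the source document plus the target document, but it is actually 10 or 20 times larger.

Looking at the structure of the output, I noticed that common resources that are stored once in each original PDF are being duplicated. Is there a way to change my code so that the output is compressed or optimized and takes up less disk space?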
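
One direction that may be worth trying, offered as a sketch rather than a confirmed fix: the loop above creates a new LayerUtility for every page. As far as I know, in PDFBox 2.x each LayerUtility keeps its own clone cache, so fonts and images shared between pages may be copied again for each page. The version below creates the LayerUtility once, before the loop. The class name SideBySideMerger, the method joinSideBySide, and the choice to treat a missing page as an empty Letter-sized slot are assumptions made for illustration. Whether this actually shrinks a given file depends on how its resources are shared.

```java
import java.awt.geom.AffineTransform;
import java.io.IOException;

import org.apache.pdfbox.multipdf.LayerUtility;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;

// Hypothetical helper class, not part of PDFBox.
public class SideBySideMerger {

    public static PDDocument joinSideBySide(PDDocument sourceDoc, PDDocument targetDoc)
            throws IOException {
        PDDocument outDoc = new PDDocument();

        // Created once, so objects already cloned for an earlier page
        // can be reused instead of being cloned again (assumption: PDFBox 2.x or later).
        LayerUtility layerUtility = new LayerUtility(outDoc);

        int sourcePages = sourceDoc.getNumberOfPages();
        int targetPages = targetDoc.getNumberOfPages();
        int maxPages = Math.max(sourcePages, targetPages);

        for (int i = 0; i < maxPages; i++) {
            // A side with no page at this index stays empty instead of
            // adding blank pages to the input documents.
            PDPage leftPage = i < sourcePages ? sourceDoc.getPage(i) : null;
            PDPage rightPage = i < targetPages ? targetDoc.getPage(i) : null;

            if (leftPage != null) {
                leftPage.setRotation(0);
            }
            if (rightPage != null) {
                rightPage.setRotation(0);
            }

            PDRectangle leftBox = leftPage != null ? leftPage.getCropBox() : PDRectangle.LETTER;
            PDRectangle rightBox = rightPage != null ? rightPage.getCropBox() : PDRectangle.LETTER;

            // Output page is as wide as both inputs together and as tall as the taller one.
            PDPage outPage = new PDPage(new PDRectangle(
                    leftBox.getWidth() + rightBox.getWidth(),
                    Math.max(leftBox.getHeight(), rightBox.getHeight())));
            outDoc.addPage(outPage);

            if (leftPage != null) {
                PDFormXObject leftForm = layerUtility.importPageAsForm(sourceDoc, leftPage);
                layerUtility.appendFormAsLayer(outPage, leftForm, new AffineTransform(), "left " + i);
            }
            if (rightPage != null) {
                PDFormXObject rightForm = layerUtility.importPageAsForm(targetDoc, rightPage);
                AffineTransform shiftRight =
                        AffineTransform.getTranslateInstance(leftBox.getWidth(), 0.0);
                layerUtility.appendFormAsLayer(outPage, rightForm, shiftRight, "right " + i);
            }
        }
        return outDoc;
    }
}
```

The caller would then save and close the result, for example by calling save on the returned document with an output path of its choice. The input documents must stay open until the output has been saved.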

We solved this by post-processing the generated PDF with Ghostscript.
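
The exact Ghostscript invocation is not given here, so the following is only a guess at what such a post-processing step could look like when driven from the same Java program. It assumes a `gs` executable on the PATH and uses the standard pdfwrite device, which rewrites the whole file. The class name GhostscriptOptimizer and its methods are hypothetical.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical wrapper around the Ghostscript command line.
public class GhostscriptOptimizer {

    // Builds the argument list for rewriting inputPdf into outputPdf with pdfwrite.
    public static List<String> buildCommand(String inputPdf, String outputPdf) {
        return List.of(
                "gs",                          // assumption: Ghostscript is installed as "gs"
                "-sDEVICE=pdfwrite",           // re-emit the document as PDF
                "-dNOPAUSE",                   // do not pause between pages
                "-dBATCH",                     // exit after the last file
                "-dQUIET",                     // suppress startup messages
                "-sOutputFile=" + outputPdf,
                inputPdf);
    }

    // Runs Ghostscript and returns its exit code (0 normally means success).
    public static int optimize(String inputPdf, String outputPdf)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(inputPdf, outputPdf))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: GhostscriptOptimizer <input.pdf> <output.pdf>");
            return;
        }
        int exitCode = optimize(args[0], args[1]);
        System.out.println("Ghostscript exit code: " + exitCode);
    }
}
```

On Windows the executable is usually named differently (for example gswin64c), so the first element of the command would need adjusting.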