PDDocument.load(file) isn't a method (PDFBox)

I want to write a simple program that extracts the text content from a PDF file using Java. Here is the code:

    PDFTextStripper ts = new PDFTextStripper();
    File file = new File("C:\\Meeting IDs.pdf");
    PDDocument doc1 = PDDocument.load(file);
    String allText = ts.getText(doc1);
    String gradeText = allText.substring(allText.indexOf("GRADE 10B"), allText.indexOf("GRADE 10C"));
    System.out.println("Meeting ID for English: "
            + gradeText.substring(gradeText.indexOf("English") + 7, gradeText.indexOf("English") + 20));

This is only part of the code, but it is the part with the problem. The error is: The method load(File) is undefined for the type PDDocument


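I learned to use PDFBox from JavaTPoint. I followed the instructions to install the PDFBox libraries and added them to the build path. My PDFBox version is 3.0.0. I also searched the source files and their methods, and I couldn't find a load method there.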

Thanks in advance.

According to the 3.0 migration guide, the PDDocument.load method has been replaced with the Loader methods:

For loading a PDF PDDocument.load has been replaced with the Loader methods. The same is true for loading a FDF document.

When saving a PDF this will now be done in compressed mode per default. To override that use PDDocument.save with CompressParameters.NO_COMPRESSION.

PDFBox now loads a PDF Document incrementally reducing the initial memory footprint. This will also reduce the memory needed to consume a PDF if only certain parts of the PDF are accessed. Note that, due to the nature of PDF, uses such as iterating over all pages, accessing annotations, signing a PDF etc. might still load all parts of the PDF overtime leading to a similar memory consumption as with PDFBox 2.0.

The input file must not be used as output for saving operations. It will corrupt the file and throw an exception as parts of the file are read the first time when saving it.

So you can either switch to an earlier 2.x version of PDFBox, or use the new Loader methods. I believe this should work:

    // Requires: import org.apache.pdfbox.Loader; (PDFBox 3.x)
    File file = new File("C:\\Meeting IDs.pdf");
    PDDocument doc1 = Loader.loadPDF(file);
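
Once the document loads, the substring calls in the question still depend on both markers being present: `indexOf` returns -1 when a marker is missing, and `substring` then throws a `StringIndexOutOfBoundsException`. As a minimal sketch of a safer extraction step (plain Java, no PDFBox involved), something like the following might help. The class name `MarkerText`, the helper `between`, and the sample text are all hypothetical, made up for illustration. Unlike the question's code, this version excludes the start marker from the result.

```java
import java.util.Optional;

public class MarkerText {
    // Hypothetical helper (not part of PDFBox): returns the text between the
    // first occurrence of startMarker and the next occurrence of endMarker,
    // or an empty Optional if either marker cannot be found.
    public static Optional<String> between(String text, String startMarker, String endMarker) {
        int start = text.indexOf(startMarker);
        if (start < 0) {
            return Optional.empty(); // start marker not found
        }
        int from = start + startMarker.length();
        int end = text.indexOf(endMarker, from);
        if (end < 0) {
            return Optional.empty(); // end marker not found after the start marker
        }
        return Optional.of(text.substring(from, end));
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the output of ts.getText(doc1).
        String allText = "GRADE 10B English 123 456 7890 GRADE 10C English 000";
        System.out.println(
                between(allText, "GRADE 10B", "GRADE 10C").orElse("(section not found)"));
    }
}
```

With the real text from `ts.getText(doc1)`, the call would be `MarkerText.between(allText, "GRADE 10B", "GRADE 10C")`, and the caller decides what to do when the result is empty instead of hitting an exception.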