Arabic problems with converting HTML to PDF using ITextRenderer

I am converting HTML to PDF with ITextRenderer, and the Arabic text does not render correctly. This is my code:

ByteArrayOutputStream out = new ByteArrayOutputStream();

ITextRenderer renderer = new ITextRenderer();
String inputFile = "C://Users//Administrator//Desktop//aaa2.html";
String url = new File(inputFile).toURI().toURL().toString();
renderer.setDocument(url);
renderer.getSharedContext().setReplacedElementFactory(
        new B64ImgReplacedElementFactory());
// Attempt to fix the Arabic rendering problem by registering a Unicode font
ITextFontResolver fontResolver = renderer.getFontResolver();
try {
    fontResolver.addFont("C://Users//Administrator//Desktop//arialuni.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
} catch (DocumentException e) {
    e.printStackTrace();
}

renderer.layout();
OutputStream outputStream = new FileOutputStream("C://Users//Administrator//Desktop//HTMLasPDF.pdf");
renderer.createPDF(outputStream, true);
/*PdfWriter writer = renderer.getWriter();

writer.open();
writer.setRunDirection(PdfWriter.RUN_DIRECTION_RTL);
OutputStream outputStream2 = new FileOutputStream(  "C://Users//Administrator//Desktop//HTMLasPDFcopy.txt");
renderer.createPDF(outputStream2);*/
renderer.finishPDF();
out.flush();
out.close();

Actual PDF result:

Expected PDF result:

How can I get Arabic ligatures to render correctly?

Greek characters seem to be omitted; they do not appear in the document.

In Flying Saucer, the generated PDF uses a default font (probably Helvetica) that covers only a very limited character set, which does not include the Greek code page.
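Since the missing characters come down to font coverage, it can help to find out first which scripts the HTML text actually uses. The following is a minimal sketch using only the Java standard library; the class name ScriptScanner and the sample string are made up for illustration and are not part of Flying Saucer or iText.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper: lists the Unicode scripts that occur in a piece of text,
// so you know which scripts your registered fonts have to cover.
final class ScriptScanner {

    /** Returns the Unicode scripts of all letters found in the given text. */
    static Set<Character.UnicodeScript> scriptsIn(String text) {
        Set<Character.UnicodeScript> scripts = EnumSet.noneOf(Character.UnicodeScript.class);
        text.codePoints()
            .filter(Character::isLetter)  // ignore spaces, digits and punctuation
            .forEach(cp -> scripts.add(Character.UnicodeScript.of(cp)));
        return scripts;
    }

    public static void main(String[] args) {
        // Sample text mixing Latin, Greek and Arabic letters (written as escapes).
        String sample = "Hello \u03b1\u03b2\u03b3 \u0645\u0631\u062d\u0628\u0627";
        System.out.println(scriptsIn(sample));
    }
}
```

If the result contains a script such as GREEK or ARABIC, the font you register with the renderer needs glyphs for it. Note that this only checks the text, not the font file.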

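If you want to do this properly (I assume with iText, since your post is tagged as such), you should use:

  • iText 7
  • pdfHTML (to convert HTML to PDF)
  • pdfCalligraph (to handle Arabic ligatures correctly)
  • a font that supports these features (as shown in another answer)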

For an example, see the HTML to PDF tutorial, more specifically the following FAQ item: How to convert HTML containing Arabic/Hebrew characters to PDF?

You need fonts that contain the required glyphs, for example:

public static final String[] FONTS = {
    "src/main/resources/fonts/noto/NotoSans-Regular.ttf",
    "src/main/resources/fonts/noto/NotoNaskhArabic-Regular.ttf",
    "src/main/resources/fonts/noto/NotoSansHebrew-Regular.ttf"
};

You also need to give the ConverterProperties a FontProvider that knows how to find these fonts:

public void createPdf(String src, String[] fonts, String dest) throws IOException {
    ConverterProperties properties = new ConverterProperties();
    FontProvider fontProvider = new DefaultFontProvider(false, false, false);
    for (String font : fonts) {
        FontProgram fontProgram = FontProgramFactory.createFont(font);
        fontProvider.addFont(fontProgram);
    }
    properties.setFontProvider(fontProvider);
    HtmlConverter.convertToPdf(new File(src), new File(dest), properties);
}

Note that the text will come out completely wrong if you don't have the pdfCalligraph add-on. That add-on didn't exist at the time Flying Saucer was created, hence you can't use Flying Saucer for converting documents with text in Arabic, Hindi, Telugu, and so on. Read the pdfCalligraph white paper if you want to learn more about ligatures.
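To make the term "ligature" concrete, here is a small standard-library sketch. It is not how pdfCalligraph works internally; it only shows that Unicode has a single code point for the lam-alef ligature (U+FEFB) and that NFKC normalization decomposes it into the two base letters. A renderer has to do the reverse at layout time, choosing joined glyphs from the plain letters, and that shaping step is what goes missing when the output looks wrong. The class name LigatureDemo is made up for illustration.

```java
import java.text.Normalizer;

// Hypothetical demo: decomposes an Arabic presentation-form ligature
// into the plain letters that normally appear in HTML source text.
final class LigatureDemo {

    /** Maps Arabic presentation forms back to their base letters via NFKC. */
    static String toBaseLetters(String shaped) {
        return Normalizer.normalize(shaped, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String ligature = "\uFEFB"; // ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
        String letters = toBaseLetters(ligature);
        // Print each resulting code point with its Unicode name.
        letters.codePoints()
               .forEach(cp -> System.out.printf("U+%04X %s%n", cp, Character.getName(cp)));
    }
}
```

HTML normally contains the base letters (U+0644 followed by U+0627), so the PDF library itself must select the ligature glyph from the font.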

I switched to wkhtmltopdf for the PDF conversion instead.
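For anyone taking the same route from Java, a minimal sketch of calling wkhtmltopdf as an external process might look like this. It assumes the wkhtmltopdf executable is installed and on the PATH; the class name WkHtmlToPdf is hypothetical, and the input and output paths are supplied by the caller.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical wrapper around the wkhtmltopdf command-line tool.
// Assumes the "wkhtmltopdf" executable can be found on the PATH.
final class WkHtmlToPdf {

    /** Builds the command line: wkhtmltopdf --encoding utf-8 <input> <output>. */
    static List<String> command(String htmlPath, String pdfPath) {
        return List.of("wkhtmltopdf", "--encoding", "utf-8", htmlPath, pdfPath);
    }

    /** Runs the conversion and returns the exit code of the process. */
    static int convert(String htmlPath, String pdfPath) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(htmlPath, pdfPath))
                .inheritIO() // forward the tool's output and errors to this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: WkHtmlToPdf <input.html> <output.pdf>");
            return;
        }
        int exitCode = convert(args[0], args[1]);
        System.out.println("wkhtmltopdf exited with code " + exitCode);
    }
}
```

The --encoding option sets the default text encoding for the input; a non-zero exit code indicates that the conversion did not complete cleanly.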