How to get PDF output on one CSV line

In my program, the CSV gets a new line for every input, like this:

Is there a way to get everything on one line?

My current code:

static void Main(string[] args)
{
    string path = @"C:\Users\burak\Desktop\todo";
    StreamWriter write = new StreamWriter(@"C:\Users\burak\Desktop\todo\test.csv");
    foreach (var file in Directory.GetFiles(path, "*.pdf", SearchOption.TopDirectoryOnly))
    {
        StringBuilder text = new StringBuilder();
        PdfReader pdfReader = new PdfReader(file);
        string currentText ="";

        for (int page = 1; page <= pdfReader.NumberOfPages; page++)
        {
            ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
            currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
            currentText = string.Join(";", currentText.Split(' ', ':', '/'));
            currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
            // text.Append(currentText);
            pdfReader.Close();
        }
        
        text.ToString();
        write.Write(currentText);
        Console.WriteLine(text.ToString());
    }
    write.Close();
}

What I have tried:

Working on the spaces to merge everything into one line, but that did not work at all.

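To remove all line breaks, we can replace them with an empty string. To get the line break used by the current system, use System.Environment.NewLine. After that, all of the PDF's text, from every page, is on a single line. To start a new line for each new PDF file, we can append System.Environment.NewLine to the end of the string and then write the whole PDF to the CSV file.

Example: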

static void Main(string[] args) {
    // ...
    StreamWriter write = new StreamWriter(@"C:\Users\burak\Desktop\todo\test.csv");
    // ...

    foreach (var file in Directory.GetFiles(path, "*.pdf", SearchOption.TopDirectoryOnly)) {
        // ...

        for (int page = 1; page <= pdfReader.NumberOfPages; page++) {
            // ...
            currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
            // ...
        }

        // Remove the line breaks
        currentText = currentText.Replace(System.Environment.NewLine, string.Empty);
        // End this PDF's row with a single line break
        currentText += System.Environment.NewLine;
        write.Write(currentText);
    }
    write.Close();
}

The input text may contain carriage returns or line feeds. You can try this:

write.Write(currentText.Replace("\r", "").Replace("\n", ""));
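
A minimal sketch that combines both suggestions, assuming the text of each page has already been extracted. In the question's code the text.Append call is commented out and only currentText is written, so only the last page of each PDF reaches the file. Also, replacing System.Environment.NewLine alone misses a lone "\n" on Windows, where NewLine is "\r\n". The class PdfCsvLine and its methods are hypothetical helper names, not part of iTextSharp, and joining the pages with ";" is an assumption based on the delimiter used in the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of iTextSharp): turns the extracted text of
// one PDF, given as one string per page, into a single CSV line.
public static class PdfCsvLine
{
    // Replaces every kind of line break ("\r\n", lone "\r", lone "\n").
    // The default replacement is the empty string, as in the answers above.
    public static string RemoveLineBreaks(string text, string replacement = "")
    {
        return text.Replace("\r\n", replacement)
                   .Replace("\r", replacement)
                   .Replace("\n", replacement);
    }

    // Cleans every page, joins the pages with ";" (assumed delimiter), and
    // ends the row with exactly one line break so each PDF gets its own row.
    public static string BuildRow(IEnumerable<string> pageTexts, string replacement = "")
    {
        var cleanedPages = new List<string>();
        foreach (string pageText in pageTexts)
        {
            // Keep every page instead of only the last one.
            cleanedPages.Add(RemoveLineBreaks(pageText, replacement));
        }
        return string.Join(";", cleanedPages) + Environment.NewLine;
    }
}
```

In the loop from the question, this would mean adding each PdfTextExtractor.GetTextFromPage result to a List<string> and calling write.Write(PdfCsvLine.BuildRow(pages)) once per file. Passing ";" as the replacement keeps the words of adjacent lines apart instead of gluing them together.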