I can't read all Rtf file content

Question

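I have an RTF file that I need to read into a parser. The file contains some special characters because it has an embedded image. When I read all the text from the file, nothing after the special characters is read.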

I tried ReadAllText with both Encoding.UTF8 and Encoding.ASCII.

Reading the file:

using System.IO;
using System.Text;

public class ReadFile
{
    public static string GetFileContent(string path)
    {
        if (!File.Exists(path))
        {
            throw new FileNotFoundException();
        }
        else
        {
            // I also tried 
            // return File.ReadAllText(path, Encoding.ASCII);
            string text = string.Empty;
            var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read);
            using (var streamReader = new StreamReader(fileStream, Encoding.UTF8))
            {
                string line;
                while ((line = streamReader.ReadLine()) != null)
                {
                    text += line;
                }
            }
            return text;
        }
    }
}

The result I actually get is all the text up to the point where the special characters start:

{\rtf1\ansi\ansicpg1252\deff0\deftab720{\fonttbl{\f0\fnil Times New Roman;}{\f1\fnil Arial;}}{\colortbl;\red000\green000\blue000;\red255\green000\blue000;\red128\green128\blue128;}\paperw11905\paperh16837\margl360\margr360\margt360\margb360 \sectd \sectdefaultcl \marglsxn360\margrsxn360\margtsxn360\margbsxn360{ {*\do\dobxpage\dobypage\dodhgt8192\dptxbx{\dptxbxtext\pard\plain {\pict\wmetafile8\picw19499\pich1746\picwgoal1305695\pichgoal116957 \bin342908

Rtf File is here

Answer 1

I solved it. To read the file, I used File.ReadAllBytes(path) and, while building the result string, replaced byte 0 with (nul) and byte 27 with (esc).

byte[] fileBytes = File.ReadAllBytes(path);

StringBuilder sb = new StringBuilder();
foreach (var b in fileBytes)
{
    // handle printable characters
    if ((b >= 32) || (b == 10) || (b == 13) || (b == 9)) // lf, cr, tab
        sb.Append((char)b);
    else
    {
        // handle control characters
        switch (b)
        {
            case 0: sb.Append("(nul)"); break;
            case 27: sb.Append("(esc)"); break;
                // etc.
        }
    }
}

return sb.ToString();
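
For anyone who wants to try this outside the original method, here is a minimal self-contained sketch of the same idea. The class name RtfDump, the method name ToVisibleText and the sample bytes are made up for illustration, and the default branch (printing any other control byte as hex instead of silently dropping it) is my own addition, not part of the snippet above. Note that the dump in the question stops right after \bin342908, which is the RTF control word announcing that 342908 bytes of raw binary data follow, so that is presumably where the NUL bytes start.

```csharp
using System;
using System.Text;

public static class RtfDump
{
    // Hypothetical helper: turns raw bytes into text and spells out
    // control bytes so they cannot cut the visible string short.
    public static string ToVisibleText(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
        {
            // Printable range plus tab, line feed and carriage return.
            if (b >= 32 || b == 9 || b == 10 || b == 13)
            {
                sb.Append((char)b);
            }
            else
            {
                switch (b)
                {
                    case 0: sb.Append("(nul)"); break;
                    case 27: sb.Append("(esc)"); break;
                    // Assumption: show every other control byte as hex.
                    default: sb.Append("(0x").Append(b.ToString("X2")).Append(')'); break;
                }
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Made-up sample standing in for File.ReadAllBytes(path).
        byte[] sample = { (byte)'a', 0, (byte)'b', 27, 1, (byte)'c' };
        Console.WriteLine(ToVisibleText(sample)); // prints a(nul)b(esc)(0x01)c
    }
}
```

In real use you would pass File.ReadAllBytes(path) instead of the sample array. The (char)b cast maps each byte to the code point with the same value, so bytes above 127 come through as Latin-1 characters rather than being decoded as UTF-8.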

I found the help in


c#

rtf

stream