Splitting a large text file into smaller text files

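I am trying to split a text file of about 6 million lines into smaller files by line count, and each output file must always end (on its last line) with a specific identifier. What I have tried: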

    using (System.IO.StreamReader sr = new System.IO.StreamReader(inputfile))
    {
        int fileNumber = 0;
        string line = "";
        while (!sr.EndOfStream)
        {
            int count = 0;
            //identifier = sr.ReadLine().Substring(0,2);
            using (System.IO.StreamWriter sw = new System.IO.StreamWriter(inputfile + ++fileNumber + ".TXT"))
            {
                sw.AutoFlush = true;
                

                while (!sr.EndOfStream && ++count < 1233123)
                {
                    line = sr.ReadLine();
                    sw.WriteLine(line);
                }
                // Having problems starting here: not sure how to implement the other condition (== "JK")
                line = sr.ReadLine();
                if (count > 1233123 && line.Substring(0,2) == "JK")
                {
                    sw.WriteLine(line);
                }
                else
                {
                    while (!sr.EndOfStream && line.Substring(0,2) != "JK")
                    {
                        line = sr.ReadLine();
                        sw.WriteLine(line);
                    }
                }
               
            }
        }
    }

Sample input text looks like this:

AAadsadasdasdasdfsdfsdfs
Bbasfafasfasdfdsfsdfsdff
CCsafsdfasdadfasdfasfsaf
DDasdsfsdfsafdsadfsafasf
JKdfgdsgdsfgsdfgsfgdfgdf
AAfsdfsadfsdfsaadfadasda
BBadfasdfasdfdsfasfasdas
CCadasdsfasdfasfasfasfds
DDsdfsdafasdfsdfdsfsdfsd
EEsadfsfsasafasdfsdfsdfs
FFasfasfadsdfdsadssfsdfs
JKadsadasdasdadsadasdasa
AAadasdasdasdasdasdasdas
BBasdadadadasdasdasdasdd
CCadasdasdasdasdasdasdad
JKsafsdfsdfasfasdfdasfsa

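Basically, what I want to end up with is multiple text files, each containing at least 1233123 lines (i.e. if line 1233123 does not start with "JK", keep writing to the current file until a "JK" line is found).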

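Check both conditions while you read and write: whether the line count for the current file has reached the 1233123 threshold, and whether the line you just wrote starts with JK. Once both are true, you can stop writing the current file part and continue with the next iteration of the outermost loop, which starts writing the next file.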

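    // Drop-in replacement for the inner using block; count is the per-file counter declared in the outer loop.
    using (System.IO.StreamWriter sw = new System.IO.StreamWriter(inputfile + ++fileNumber + ".TXT"))
    {
        // ReadLine returns null at the end of the input, which also ends the last part.
        while ((line = sr.ReadLine()) != null)
        {
            sw.WriteLine(line);

            // Stop once this part has at least 1233123 lines and the line just written starts with "JK".
            // StartsWith avoids the exception Substring(0,2) throws on lines shorter than two characters.
            if (++count >= 1233123 && line.StartsWith("JK", System.StringComparison.Ordinal))
            {
                break;
            }
        }
    }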
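
For completeness, here is a minimal self-contained sketch of the whole loop, in case it helps to see the pieces together. The class name `FileSplitter`, the method `Split`, and its parameters `minLines` and `marker` are my own names for illustration, not anything from the question. It assumes the last part may be shorter than the threshold, or may not end on a marker line, if the input simply runs out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileSplitter
{
    // Hypothetical helper: writes inputPath1.TXT, inputPath2.TXT, ... and returns their paths.
    // Each part holds at least minLines lines and ends on a line starting with marker,
    // except possibly the last part, which ends wherever the input ends.
    public static List<string> Split(string inputPath, int minLines, string marker)
    {
        var parts = new List<string>();
        using (var reader = new StreamReader(inputPath))
        {
            int fileNumber = 0;
            while (!reader.EndOfStream)
            {
                string partPath = inputPath + (++fileNumber) + ".TXT";
                parts.Add(partPath);
                int count = 0; // reset for every part
                using (var writer = new StreamWriter(partPath))
                {
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        writer.WriteLine(line);
                        if (++count >= minLines && line.StartsWith(marker, StringComparison.Ordinal))
                        {
                            break; // part is complete; the outer loop opens the next one
                        }
                    }
                }
            }
        }
        return parts;
    }

    public static void Main()
    {
        // Small demo with a threshold of 4 instead of 1233123.
        string input = Path.Combine(Path.GetTempPath(), "split_demo_" + Guid.NewGuid().ToString("N"));
        File.WriteAllLines(input, new[]
        {
            "AA1", "BB1", "JK1", "AA2", "BB2", "CC2", "DD2", "JK2", "AA3", "JK3"
        });

        foreach (string part in Split(input, 4, "JK"))
        {
            Console.WriteLine(Path.GetFileName(part) + ": " + File.ReadAllLines(part).Length + " lines");
        }
    }
}
```

In the demo, the first "JK" line arrives at line 3, before the threshold of 4, so it does not end the part. The first file therefore runs through "JK2" (8 lines) and the second file gets the remaining 2 lines. For the real input you would call something like `Split(inputfile, 1233123, "JK")`.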