Issue with infinite loop when reading from file

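I am writing a C# program that reads every unique word from a file, counts how many times each word occurs, and writes the result to a CSV file. My problem is that when I run the program, it never exits the while loop that reads the file line by line.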

public override List<WordEntry> GetWordCount()
{
        List<WordEntry> words = new List<WordEntry>();
        WordEntry wordEntry = new WordEntry();
        //string[] tokens = null;
        string line, temp, getword;
        int count = 0, index = 0;
        long number;

        while ((line = input.ReadLine()) != null)
        {
            if (line == null)
                Debug.Write("shouldnt happen");
            char[] delimit = { ' ', ',' };
            string[] tokens = line.Split(delimit);

            if (words.Count == 0)
            {
                wordEntry.Word = tokens[0];
                wordEntry.WordCount = 1;
                words.Add(wordEntry);
            }//end if

            for (int i = 0; i < tokens.Length; i++)
            {
                for (int j = 0; j < words.Count; j++)
                {
                    if (tokens[i] == words[j].Word)
                    {
                        number = words[j].WordCount;
                        number++;
                        getword = words[j].Word;
                        wordEntry.WordCount = number;
                        wordEntry.Word = getword;
                        words.RemoveAt(j);
                        words.Insert(j, wordEntry);
                    }//end if
                    else
                    {
                        wordEntry.Word = tokens[i];
                        wordEntry.WordCount = 1;
                        words.Add(wordEntry);
                    }//end else
                }//end for
            }//end for
        }//end while
        return words;
}

It gets stuck in the while loop as if it never reaches the end of the file. The file is only 2.6 MB, so it should be able to finish.

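I think your code never actually gets past "for (int j = 0; j < words.Count; j++)", because new items keep being added to the words list. The else branch appends an entry every time a token does not match the entry at index j, so words.Count keeps growing while the loop is still iterating over it.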
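
To make the growth visible, here is a small sketch that replays the question's inner loops on a handful of tokens and reports how many entries the list ends up with. The WordEntry class below is a hypothetical stand-in (the question does not show its definition), and it is assumed to be a class rather than a struct; GrowthDemo and CountEntriesAfter are names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's WordEntry (assumed to be a class).
public class WordEntry
{
    public string Word;
    public long WordCount;
}

public static class GrowthDemo
{
    // Replays the question's loops over one array of tokens and returns
    // the number of entries left in the list afterwards.
    public static int CountEntriesAfter(string[] tokens)
    {
        var words = new List<WordEntry>();
        var wordEntry = new WordEntry(); // one shared instance, as in the question

        if (tokens.Length > 0)
        {
            wordEntry.Word = tokens[0];
            wordEntry.WordCount = 1;
            words.Add(wordEntry);
        }

        for (int i = 0; i < tokens.Length; i++)
        {
            for (int j = 0; j < words.Count; j++)
            {
                if (tokens[i] == words[j].Word)
                {
                    long number = words[j].WordCount + 1;
                    wordEntry.WordCount = number;
                    wordEntry.Word = words[j].Word;
                    words.RemoveAt(j);
                    words.Insert(j, wordEntry);
                }
                else
                {
                    // Runs on a mismatch at index j, not once per new word.
                    wordEntry.Word = tokens[i];
                    wordEntry.WordCount = 1;
                    words.Add(wordEntry);
                }
            }
        }
        return words.Count;
    }

    public static void Main()
    {
        string[] tokens = { "a", "b", "c", "a", "b", "c" };
        Console.WriteLine(CountEntriesAfter(tokens)); // prints 6
    }
}
```

Under these assumptions the list holds six entries for only three distinct words, and every new token rescans the whole list, so the work grows roughly with the square of the token count. On a 2.6 MB file that can look like a hang. If WordEntry were a struct instead, the list would grow even faster, because each mismatch would append a separate copy.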

Here is how you could rewrite the code to use a dictionary.

var words = new Dictionary<string, int>();
char[] delimit = { ' ', ',' };
string line;

// ReadLine returns null at the end of the input, which ends the loop.
while ((line = input.ReadLine()) != null)
{
    string[] tokens = line.Split(delimit);

    foreach (var word in tokens)
    {
        // Bump the count for a known word, or start a new word at 1.
        if (words.ContainsKey(word))
            words[word]++;
        else
            words.Add(word, 1);
    }
}

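This reduces the complexity of the code: a dictionary lookup is O(1) on average, whereas the original version scans the whole list for every token.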
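
As a self-contained variation on the same idea, the sketch below uses TryGetValue so that a word already in the dictionary is not looked up twice. The class name WordCounter, the method name Count, and the sample text are made up for illustration. The sketch also assumes that empty tokens (for example from two consecutive spaces) should be skipped, which the snippet above does not do, and it uses the "out int" inline declaration, which needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class WordCounter
{
    // Counts how often each token occurs in the text supplied by the reader.
    public static Dictionary<string, int> Count(TextReader input)
    {
        var counts = new Dictionary<string, int>();
        char[] delimiters = { ' ', ',' };
        string line;

        while ((line = input.ReadLine()) != null)
        {
            // Assumption: empty tokens are noise and should not be counted.
            string[] tokens = line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);

            foreach (string word in tokens)
            {
                // current is left at 0 when the word has not been seen yet.
                counts.TryGetValue(word, out int current);
                counts[word] = current + 1;
            }
        }
        return counts;
    }

    public static void Main()
    {
        using (var reader = new StringReader("a b,a\nc a"))
        {
            foreach (var pair in Count(reader))
                Console.WriteLine(pair.Key + "," + pair.Value);
        }
    }
}
```

Passing a TextReader rather than a file path keeps the sketch easy to try with a StringReader; a StreamReader over the real file could be passed in the same way.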

Edit

You can convert the dictionary to a List<WordEntry> like this (it needs "using System.Linq;"):

return words
    .Select(kvp => new WordEntry
        {
            Word = kvp.Key, 
            WordCount = kvp.Value
        })
    .ToList();
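
Since the question also wants the counts written to a CSV file, here is a rough sketch of the conversion followed by the output step. The WordEntry class is a guess at its shape (only the Word and WordCount members appear in the question), and WordReport, ToEntries, and WriteCsv are hypothetical names. Sorting by count and the "Word,Count" header row are additions for readability, not something the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical shape of WordEntry; the question does not show its definition.
public class WordEntry
{
    public string Word { get; set; }
    public long WordCount { get; set; }
}

public static class WordReport
{
    // Converts the dictionary into entries, most frequent word first.
    public static List<WordEntry> ToEntries(Dictionary<string, int> counts)
    {
        return counts
            .Select(kvp => new WordEntry { Word = kvp.Key, WordCount = kvp.Value })
            .OrderByDescending(entry => entry.WordCount)
            .ThenBy(entry => entry.Word, StringComparer.Ordinal)
            .ToList();
    }

    // Writes one "word,count" row per entry. No quoting is applied: commas
    // cannot occur in a word because ',' is a delimiter, but words that
    // contain double quotes would need escaping in a real CSV writer.
    public static void WriteCsv(IEnumerable<WordEntry> entries, TextWriter output)
    {
        output.WriteLine("Word,Count");
        foreach (WordEntry entry in entries)
            output.WriteLine(entry.Word + "," + entry.WordCount);
    }

    public static void Main()
    {
        var counts = new Dictionary<string, int> { { "the", 3 }, { "cat", 1 } };
        WriteCsv(ToEntries(counts), Console.Out);
    }
}
```

To write to disk instead of the console, a StreamWriter opened on the target path can be passed as the output argument.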