How to read files using STL containers

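I have to read "testdata.txt" word by word and look up each word in another file, "dictionary.txt". I have already implemented reading "dictionary.txt" in the ReadDictionary() function. Now I have to implement the public member function ReadTxtFile() so that it reads the file "testdata.txt" into the private data members KnownWords and UnknownWords. KnownWords should contain only the "known" words, and all other words should go into UnknownWords. I have to use map and pair, but I don't know how to use them in my program. Can someone help me figure this out so that I get this output: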

89 known words read.
49 unknown words read.

int main():

WordStats ws;
ws.ReadTxtFile();

Header file:

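#include <algorithm>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

using namespace std;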
typedef map<string, vector<int> > WordMap;     
typedef WordMap::iterator WordMapIter;        

class WordStats
{
public:
    WordStats();
    void ReadDictionary();
    void DisplayDictionary();
    void ReadTxtFile();
private:
    WordMap KnownWords;
    WordMap UnknownWords;
    set<string> Dictionary;
    char Filename[256];
};

Here is my program:

WordStats::WordStats(){
    strcpy(Filename,"testdata.txt");
}

// Reads dictionary.txt into Dictionary
void WordStats::ReadDictionary(){
    string word;    
    ifstream infile("dictionary.txt");
    if(!infile)
    {
        cerr << "Error Opening file 'dictionary.txt. " <<endl;
        exit(1);
    }
    while(getline(infile,word))
    {       
        transform (word.begin(), word.end(), word.begin(), ::tolower);
        Dictionary.insert(word); 
    }
    infile.close();
    cout << endl;
    cout << Dictionary.size() << " words read from dictionary. \n" <<endl;

}
// Reads textfile into KnownWords and UnknownWords
void WordStats::ReadTxtFile(){
    string words;
    vector<string> findword;
    vector<int> count;
    ifstream ifile(Filename);
    if(!ifile)
    {
        cerr << "Error Opening file 'dictionary.txt. " <<endl;
        exit(1);
    }
    while(!ifile.eof())
    {
        getline(ifile,words);
        //KnownWords.insert( pair<string,int>( KnownWords, words ) );
        findword.push_back(words);
        Paragraph = KnownWords.find(words);
        //stuck here
    }
}

First, WordMap is the wrong data type. In my humble opinion, it should simply be map<string, int>, because you want to count how many times each word occurs in your text.

Second, you should read individual words from the file rather than whole lines of text. You can do that with the following code:

// Assumes WordMap has been redefined as map<string, int>, as suggested above.
std::string word;
while (ifile >> word) {
    if (Dictionary.find(word) != Dictionary.end()) {
        // WordMap::value_type(...) creates a std::pair<const std::string, int>.
        // insert() leaves an existing entry untouched and returns an iterator to it.
        auto it = KnownWords.insert(KnownWords.end(), WordMap::value_type(word, 0));
        it->second++;
    } else {
        auto it = UnknownWords.insert(UnknownWords.end(), WordMap::value_type(word, 0));
        it->second++;
    }
}
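
The same counting idea can be tried outside the class. The following is only a minimal sketch, assuming the map type is std::map<std::string, int> as suggested above; the names Counts and CountWords are made up for this illustration, and the function reads from any std::istream so that a string can stand in for testdata.txt. It uses operator[] instead of insert(), which is a shorter way to get the same count.

```cpp
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch; not part of the original class.
struct Counts {
    std::map<std::string, int> known;
    std::map<std::string, int> unknown;
};

// Reads whitespace-separated words and counts how often each one occurs,
// split by whether the word is in the dictionary.
Counts CountWords(std::istream& in, const std::set<std::string>& dictionary) {
    Counts result;
    std::string word;
    while (in >> word) {
        if (dictionary.count(word)) {
            // operator[] inserts the key with a count of 0 if it is missing.
            ++result.known[word];
        } else {
            ++result.unknown[word];
        }
    }
    return result;
}
```

For example, feeding it a std::istringstream holding "the cat saw the dog" with a dictionary containing "the" and "cat" gives two known words ("the" counted twice) and two unknown words.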

It seems you need to check whether the dictionary contains each word you read, and then choose which of KnownWords and UnknownWords to modify.

void WordStats::ReadTxtFile(){
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }

I have cleaned up your local declarations so that each variable exists only for as long as it is needed.

Assuming the file contains words separated by spaces and newlines, read each word

    for (std::string word; ifile >> word; )
    {

convert it to lowercase

        transform (word.begin(), word.end(), word.begin(), ::tolower);

then check whether it is in Dictionary

        if (Dictionary.count(word))
        {

and record the position in KnownWords[word]

            KnownWords[word].push_back(ifile.tellg());
        }
        else
        {

or in UnknownWords[word].

            UnknownWords[word].push_back(ifile.tellg()); 
        }
    }

Then display the size of each map to get the desired output.

    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}
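
Note that size() counts distinct words, because each word appears as a key only once no matter how many positions are stored for it. The question does not say whether 89 and 49 are distinct words or total occurrences, so in case the totals are wanted, here is a small sketch that sums the lengths of the position vectors. TotalOccurrences is a hypothetical helper, not something from the original class.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > WordMap;

// Hypothetical helper: total number of occurrences recorded in a WordMap,
// i.e. the sum of the lengths of all position vectors.
std::size_t TotalOccurrences(const WordMap& words) {
    std::size_t total = 0;
    for (WordMap::const_iterator it = words.begin(); it != words.end(); ++it) {
        total += it->second.size();
    }
    return total;
}
```

With "the" recorded twice and "cat" once, size() is 2 while TotalOccurrences returns 3.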

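You can replace the conditional statement, whose two branches repeat the same operation, with a conditional expression. Note the reference type in the declaration of Words.
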
WordMap & Words = (Dictionary.count(word) ? KnownWords : UnknownWords);
Words[word].push_back(ifile.tellg()); 
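
The reference matters: declared as a plain WordMap, Words would be a copy, and the push_back would be lost when the copy goes out of scope. A minimal sketch that isolates those two lines follows; Record is a hypothetical free function written only for this illustration, and the position is passed in as a plain int rather than taken from tellg().

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > WordMap;

// Hypothetical free function mirroring the two lines above: picks the target
// map by reference and records one position in it.
void Record(const std::set<std::string>& dictionary,
            WordMap& known, WordMap& unknown,
            const std::string& word, int position) {
    // Both operands are lvalues of the same type, so the conditional
    // expression is itself an lvalue and can bind to a non-const reference.
    WordMap& words = dictionary.count(word) ? known : unknown;
    words[word].push_back(position);
}
```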

As a complete function:

void WordStats::ReadTxtFile(){
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }

    for (std::string word; ifile >> word; )
    {
        transform (word.begin(), word.end(), word.begin(), ::tolower);
        WordMap & Words = (Dictionary.count(word) ? KnownWords : UnknownWords);
        Words[word].push_back(ifile.tellg()); 
    }

    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}
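
Two caveats about the stored positions: tellg() is called after the extraction, so it reports the offset just past the word rather than its start, and once the stream has hit end-of-file (a last word with no trailing whitespace) it may report -1. If the exact byte offset is not required, one alternative is to store the running word index instead. The following is a sketch of that substitution, not the function above: ClassifyWords is a hypothetical free function that takes any std::istream so it can be tried on a string, and it also routes tolower through unsigned char, since passing a negative char value to it is undefined.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > WordMap;

// Hypothetical free-function variant of ReadTxtFile for experimenting.
// Records the 0-based index of each word rather than a tellg() offset.
void ClassifyWords(std::istream& in, const std::set<std::string>& dictionary,
                   WordMap& known, WordMap& unknown) {
    int index = 0;
    for (std::string word; in >> word; ++index) {
        // Convert via unsigned char before calling std::tolower.
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        WordMap& words = dictionary.count(word) ? known : unknown;
        words[word].push_back(index);
    }
}
```

Given the text "The cat saw THE dog" and a dictionary containing "the" and "cat", this leaves two entries in each map, with "the" recorded at indices 0 and 3.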