Using erase() in a while loop and segfault C++

Question

OK, so I have a bit of a problem here. This code runs on a friend's computer, but when I try to run it I get a segmentation fault.

I'm reading a file that looks like this:

word 2 wor ord
anotherword 7 ano oth the her erw wor ord
...

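I want to parse every word in the file. The first two tokens on each line (for example, word and 2) should be erased, but the first one should be saved in another variable along the way.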

I looked around for how to do this and came up with this half-baked code, which seems to work on my friend's computer but not on mine.

Dictionary::Dictionary() {
    ifstream ip;
    ip.open("words.txt", ifstream::in);
    string input;
    string buf;
    vector<string> tokens; // Holds words
    while(getline(ip, input)){
        if(input != " ") {
            stringstream ss(input);
            while(ss >> buf) {
                tokens.push_back(buf);
        }
        string werd = tokens.at(0);
        tokens.erase(tokens.begin()); // Remove the word from the vector
        tokens.erase(tokens.begin()); // Remove the number indicating trigrams
        Word curr(werd, tokens); 
        words[werd.length()].push_back(curr); // Put the word at the vector with word length i.
        tokens.clear();
    }
}
ip.close();
}

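What is the best practice for parsing this structure from a file, removing the first two elements but keeping the rest? As you can see, I'm building a Word object that holds a string and a vector for later use.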

Regards

Edit: it seems to add the first line just fine, but it crashes with a segmentation fault when erasing the second element.

Edit: words.txt contains this:

addict 4 add ddi dic ict 
sinister 6 ini ist nis sin ste ter 
test 2 est tes 
cplusplus 7 cpl lus lus plu plu spl usp

There are no leading or trailing spaces. Not that it ever gets as far as reading all of it.

Word.cc:

#include <string>
#include <vector>
#include <algorithm>
#include "word.h"

using namespace std;

Word::Word(const string& w, const vector<string>& t) : word(w), trigrams(t) {}

string Word::get_word() const {
    return word;
}

unsigned int Word::get_matches(const vector<string>& t) const {
    vector<string> sharedTrigrams;
    set_intersection(t.begin(),t.end(), trigrams.begin(), trigrams.end(), back_inserter(sharedTrigrams));
    return sharedTrigrams.size();
}

Answer 1

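You forgot to include the initialization of the variable "words" in your code. Looking at it, my guess is that you initialize "words" as a fixed-length array of vectors, but then read a word whose length is past the end of that array. Bang, you're dead. Add a check on "werd.length()" to make sure it is strictly less than the length of "words".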
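A minimal sketch of that check, since the real declaration of "words" was never posted. The Buckets type, the kMaxWordLength value, and the add_word helper are all made up for illustration, and the sketch stores plain strings instead of Word objects to stay short:

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the `words` member: a fixed-length array of
// vectors indexed by word length. The size 25 is an arbitrary assumption.
constexpr std::size_t kMaxWordLength = 25;
using Buckets = std::array<std::vector<std::string>, kMaxWordLength>;

// Stores the word in the bucket for its length.
// Returns false, without touching the array, when no such bucket exists.
bool add_word(Buckets& words, const std::string& werd) {
    if (werd.length() >= words.size()) {
        return false;  // words[werd.length()] would be out of bounds
    }
    words[werd.length()].push_back(werd);
    return true;
}
```

If "words" turns out to be a vector of vectors rather than an array, the same comparison against words.size() applies.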

Answer 2

First, the code you posted has the wrong number of closing } braces. If you indent it properly, you'll see that your code is really:

while(getline(ip, input))
{
   if(input != " ") 
   {
      stringstream ss(input);
      while(ss >> buf) {
         tokens.push_back(buf);
      }
   }
   string werd = tokens.at(0);
   tokens.erase(tokens.begin());
   tokens.erase(tokens.begin());
   Word curr(werd, tokens); 
   words[werd.length()].push_back(curr);
   tokens.clear();
}
}

Assuming that was just a typo made while posting, the other problem is that when input == " ", tokens is empty, yet you go on to use tokens as if it had 2 or more items in it.

You can fix that by moving everything inside the if statement.

while(getline(ip, input))
{
   if(input != " ") 
   {
      stringstream ss(input);
      while(ss >> buf) {
         tokens.push_back(buf);
      }

      string werd = tokens.at(0);
      tokens.erase(tokens.begin());
      tokens.erase(tokens.begin());
      Word curr(werd, tokens); 
      words[werd.length()].push_back(curr);
      tokens.clear();
   }
}

I would add further checks to make it more robust.

while(getline(ip, input))
{
   if(input != " ") 
   {
      stringstream ss(input);
      while(ss >> buf) {
         tokens.push_back(buf);
      }

      string werd;

      if ( !tokens.empty() )
      {
         werd = tokens.at(0);
         tokens.erase(tokens.begin());
      }

      if ( !tokens.empty() )
      {
         tokens.erase(tokens.begin());
      }

      Word curr(werd, tokens); 
      words[werd.length()].push_back(curr);
      tokens.clear();
   }
}
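The same guards can also be folded into one size check followed by a single range erase. This is only a sketch: split_entry is a hypothetical helper name, and rejecting lines with fewer than two tokens is my assumption about what you want for blank or malformed input.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line on whitespace, stores the first token
// in `werd`, and leaves only the trigrams in `tokens`.
// Returns false and leaves `tokens` empty when the line has fewer than two
// tokens, so erase() is never called on a vector that is too short.
bool split_entry(const std::string& input, std::string& werd,
                 std::vector<std::string>& tokens) {
    tokens.clear();
    std::istringstream ss(input);
    std::string buf;
    while (ss >> buf) {
        tokens.push_back(buf);
    }
    if (tokens.size() < 2) {
        tokens.clear();
        return false;  // blank or malformed line: nothing safe to erase
    }
    werd = tokens[0];
    // One range erase removes both the word and the trigram count.
    tokens.erase(tokens.begin(), tokens.begin() + 2);
    return true;
}
```

Inside the getline loop you would then only build the Word and push it when split_entry returns true.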

Answer 3

ifstream ip;
ip.open("words.txt", ifstream::in);
string input;
while(getline(ip, input)){
   istringstream iss(input);
   string str;
   unsigned int count = 0;
   if(iss >> str >> count) {
      vector<string> tokens { istream_iterator<string>(iss), istream_iterator<string>() }; // Holds words
      if(tokens.size() == count)
         words[str.length()].emplace_back(str, tokens);
   }
}
ip.close();

This is what I used to get it working.
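For anyone who wants to try the approach outside the Dictionary class, here is a self-contained sketch of it. The Entry struct is a hypothetical stand-in for Word, and using a std::map keyed by word length is my own assumption (the real type of "words" was not posted); it is chosen here so that a long word cannot index out of range.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Word class from the question.
struct Entry {
    std::string word;
    std::vector<std::string> trigrams;
};

// Keyed by word length. A map is an assumption, not the original container.
using Index = std::map<std::size_t, std::vector<Entry>>;

// Reads "word count tri1 tri2 ..." lines from `in` and returns how many
// lines were accepted. Nothing is erased: the word and the count are
// extracted first, and the rest of the line becomes the trigram vector.
std::size_t load_entries(std::istream& in, Index& words) {
    std::size_t accepted = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::string word;
        std::size_t count = 0;
        if (!(iss >> word >> count)) {
            continue;  // blank line, or no numeric count after the word
        }
        std::vector<std::string> trigrams{
            std::istream_iterator<std::string>(iss),
            std::istream_iterator<std::string>()};
        if (trigrams.size() != count) {
            continue;  // declared count does not match the data
        }
        words[word.length()].push_back(Entry{word, std::move(trigrams)});
        ++accepted;
    }
    return accepted;
}
```

Passing a std::ifstream opened on words.txt works the same way as the std::istringstream you would use to test it.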
