String tokenization in C++ throws a seg fault

I want to write a function that splits a string into tokens. So far I have come up with the following:

#include <cstring>
#include <iostream>
#include <vector>
#define MAXLEN 20

void mytoken(std::string input, std::vector<std::string> & out);

int main() 
{
    std::vector<std::string> out;
    std::string txt = "XXXXXX-CA";
    mytoken(txt, out);
    std::cout << "0: " << out[0] <<std::endl;
    std::cout << "1: " << out[1] <<std::endl;
}

void mytoken(std::string instr, std::vector<std::string> & out) {
    std::vector<std::string> vec;
    char input[MAXLEN] = {0};
    strcpy(input, instr.c_str());
    char *token = std::strtok(input, "-");
    while (token != NULL) {
        std::cout << token << '\n';
        token = std::strtok(NULL, "-");
        out.push_back(token);
    }    
}

It produces the following output:

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
XXXXXX
CA
bash: line 7: 21987 Aborted                 (core dumped) ./a.out

I would like to know why this happens.

It is better to use a C++-style approach, which is simpler and more readable:

#include <sstream>

void mytoken(std::string instr, std::vector<std::string> & out)
{
    std::istringstream ss(instr);
    std::string token;
    while(std::getline(ss, token, '-'))
    {
        std::cout << token << '\n';
        out.push_back(token);
    }
}
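
One behavioural difference is worth knowing before swapping strtok for std::getline: strtok treats a run of delimiters as a single separator, while std::getline returns an empty string between two adjacent delimiters. The sketch below is a variant of the function above under a hypothetical name, split_skip_empty (not part of the original answer). It takes the delimiter as a parameter, returns the vector instead of filling an out-parameter, and drops empty tokens to mimic strtok. Remove the empty() check if you want to keep them.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `input` on `delim`, skipping empty tokens,
// so that "a--b" yields {"a", "b"} as strtok would.
std::vector<std::string> split_skip_empty(const std::string& input, char delim)
{
    std::vector<std::string> out;
    std::istringstream ss(input);
    std::string token;
    // std::getline reads up to (and discards) the next delimiter.
    while (std::getline(ss, token, delim)) {
        if (!token.empty()) {
            out.push_back(token);
        }
    }
    return out;
}
```

With the input from the question, split_skip_empty("XXXXXX-CA", '-') gives the two tokens XXXXXX and CA.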

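To make your example work, you need to change the order of the operations in the loop. In your version, strtok is called again before push_back, so the first token is never stored, and on the last iteration token is NULL. Constructing a std::string from a null pointer is what throws the std::logic_error you see. Store the current token first, then advance: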

//...
while(token != NULL)
{
    out.push_back(token);
    std::cout << token << '\n';
    token = std::strtok(NULL, "-");
}
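
For completeness, here is a sketch of the whole strtok-based function with the corrected loop order. One change here goes beyond the fix above and is my own assumption about what you want: the fixed char input[MAXLEN] buffer is replaced by a copy sized from the input, because strcpy into a 20-byte array overflows for any string of 20 characters or more.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Sketch: strtok version with the loop order fixed.
// Assumption: the buffer is sized from the input rather than from MAXLEN.
void mytoken(const std::string& instr, std::vector<std::string>& out)
{
    // strtok modifies its argument, so work on a writable,
    // null-terminated copy of the input.
    std::vector<char> buf(instr.begin(), instr.end());
    buf.push_back('\0');

    char* token = std::strtok(buf.data(), "-");
    while (token != nullptr) {
        out.push_back(token);               // store the current token first
        token = std::strtok(nullptr, "-");  // then advance; may return nullptr
    }
}
```

Called with "XXXXXX-CA", this fills out with XXXXXX and CA, so out[0] and out[1] in main are both valid. Note that strtok keeps internal static state, so this version should not be used from several threads at once.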