C++ unordered_map emplace() 函数抛出段错误,我不知道为什么

C++ unordered_map emplace() function throwing seg fault and I have no idea why

我在使用 std::unordered_map::emplace() 时遇到段错误。这是最小的可重现示例:

#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

class WordTable {
public:
  WordTable() {
    total = 0;
  }
  ~WordTable() {}

  void addWord(const string word, const int incr = 1) {
    cout << "begin emplace" << endl;
    table.emplace(word, Node()); //this is where the seg fault occurs
    cout << "emplace succeeded" << endl;
    if (incr) {
      table[word].incrementCount();
      incrementTotal();
    }
  }
private:
  struct Node {
  public:
    Node() {
      count = 0;
      kids = new WordTable();
    }
    ~Node() {
      delete kids;
    }
    int incrementCount() {
      return ++count;
    }
  private:
    int count;
    WordTable* kids;
  };

  int incrementTotal() {
    return ++total;
  }
  int total;
  unordered_map<string, Node> table;
};

int main() {
  WordTable tmp;
  cout << "add word 1" << endl;
  tmp.addWord("Hi");
  cout << "end add word 1" << endl;
  tmp.addWord("Hi");
  cout << "end add word 2" << endl;

  WordTable* test = new WordTable();
  cout << "add word 3" << endl;
  test->addWord("Hi");
  cout << "end add word 3" << endl;
}

以及对应的输出:

add word 1
begin emplace
emplace succeeded
end add word 1
begin emplace
emplace succeeded
end add word 2
add word 3
begin emplace
Segmentation fault (core dumped)

段错误发生在第三次调用addWord()时调用.emplace()

应该发生的是 addWord("Hi") 添加到 std::unordered_map<std::string, Node> table 的映射。该映射应将 "Hi" 作为键值,并将 Node() 对象作为映射值。

这是第一个奇怪的部分:如果在第三次调用之前我只调用了一次 addWord(),则没有段错误。这是输出:

add word 1
begin emplace
emplace succeeded
end add word 1
end add word 2
add word 3
begin emplace
emplace succeeded
end add word 3

第二个奇怪的部分是,如果我静态分配 test,那么也没有段错误。这是它的输出:

add word 1
begin emplace
emplace succeeded
end add word 1
begin emplace
emplace succeeded
end add word 2
add word 3
begin emplace
emplace succeeded
end add word 3

我不知道发生了什么,也不知道为什么会这样。我只是不明白在 STL unordered_map::emplace() 中怎么会出现段错误。我能想到的唯一问题是我在 addWord() 中创建 Node() 的方式,但我不明白这将如何使 addWord() 在前两个调用中取得成功,但 seg错在第三。

如有任何帮助,我将不胜感激!!!

Node中,您在构造函数和析构函数中分配和释放WordTable *kids,但它将具有默认复制构造函数和运算符。 这些只会复制指针本身,不会创建新对象,例如:

Node(const Node &cp) // default
    : count(cp.count), kids(cp.kids) // no "new"!
{}

当这些副本中的第一个被破坏时,指针被删除,留下一个无效指针的其他副本,这有望在访问时崩溃(替代方案通常是某种形式的堆损坏)。在这种情况下,第二次访问似乎因编译器而异,GCC 似乎 运行 由于制作了额外的副本而进入 emplace 内的问题,MSVC 直到进入 ~WordTable() 时才开始 returns(WordTable tmp 堆栈变量)。

您可以通过跟踪 new/delete:

来查看
Node() {
  count = 0;
  kids = new WordTable();
  cout << "new kids " << kids << endl;
}
~Node() {
  cout << "delete kids " << kids << endl;
  delete kids;
}
// GCC
add word 1
begin emplace
new kids 0xa38c30
delete kids 0xa38c30 // was deleted inside emplace, but when `~WordTable` happens later (if it got that far) will be deleted a second time, the one stored in the WordTable instance.
emplace succeeded
end add word 1
begin emplace
new kids 0xa38c30 // Can get the same pointer as 0xa38c30 is now "free"
delete kids 0xa38c30 // and again
delete kids 0xa38c30 // this time twice due to some GCC specific implementation detail, when the value "Hi" already existed so the value was no longer wanted
emplace succeeded
end add word 2
add word 3
begin emplace
new kids 0xa38cf0 // same pointer again
SIGSEGV // this time not so lucky, maybe because that double deletion just above corrupted something

您可以通过 "deleting" 构造函数运算符来防止默认复制:

Node(const Node &) = delete;
Node &operator = (const Node &) = delete;

这会将 table.emplace(word, Node()); 变成编译错误,因为这是复制发生的地方。 尽管您调用了 emplace,但您向它传递了一个完整的临时对象,因此它会尝试并放置到复制构造函数 Node(const Node &)。 你想用 emplace 做的是将构造函数参数传递给它,这对于默认构造函数的情况有点棘手,一个简单的 table.emplace(word) 不会编译:

table.emplace(std::piecewise_construct, std::make_tuple(word), std::make_tuple());

或者,如果您希望您的对象可以安全地复制,请显式实现这些功能。

Node(const Node &cp)
    : count(cp.count), kids(new WordTable()) // create a full new object this time
{
    *kids = *cp.kids;
    cout << "copy constructor " << kids << endl;
}
Node &operator = (const Node &cp)
{
    *kids = *cp.kids;
    cout << "copy operator " << kids << endl;
    return *this;
}
add word 1
begin emplace
new kids 0xee8c30
copy constructor 0xee8cd0 // this time made a new object
delete kids 0xee8c30 // deleted the original but 0xee8cd0 is still valid
emplace succeeded
end add word 1
begin emplace
new kids 0xee8c30
copy constructor 0xee8d90
delete kids 0xee8d90
delete kids 0xee8c30
emplace succeeded
end add word 2
add word 3
begin emplace
new kids 0xee8d40
copy constructor 0xee8de0
delete kids 0xee8d40
emplace succeeded
end add word 3
delete kids 0xee8cd0 // that first copy getting deleted when main returns

WordTable 的副本很好,因为 unordered_map<string, Node> 将使用刚刚提供的那些单独复制每个 key/value。

另一个类似的替代方法是提供合适的移动构造函数和运算符,或者与副本一起提供,或者删除副本。

Node(const Node &cp) = delete;
Node &operator = (const Node &cp) = delete;
Node(Node && mv)
    : count(mv.count), kids(mv.kids)
{
    mv.kids = nullptr; // took kids away from mv
    cout << "move constructor " << kids << endl;
}
Node &operator = (Node &&mv)
{
    swap(count, mv.count);
    swap(kids, mv.kids);
    cout << "move operator " << kids << " from " << mv.kids << endl;
    return *this;
}
add word 1
begin emplace
new kids 0x1c4cc30 // temporary object
move constructor 0x1c4cc30 // final emplace value
delete kids 0 // moved it out, so nothing is deleted here
emplace succeeded
end add word 1
begin emplace
new kids 0x1c4ccf0
move constructor 0x1c4ccf0
delete kids 0x1c4ccf0  // delete due to duplicate "Hi"
delete kids 0 // again it was moved, so empty
emplace succeeded
end add word 2
add word 3
begin emplace
new kids 0x1c4ccf0
move constructor 0x1c4ccf0
delete kids 0
emplace succeeded
end add word 3
delete kids 0x1c4cc30

请记住,无论您将移动的对象置于何种状态(例如此处为零 count 和空 kids),它本身都必须有效。因此,您需要小心并进行适当的 if (kids == nullptr) 检查,如果您这样做了。


像这样的案例对于 std::unique_ptr 来说也是一个很好的案例,其中一些对象在一个唯一的地方被创建和销毁,从而节省了手动 delete 的需要。它还会自动阻止默认复制,因为 unique_ptr 本身不允许复制,但允许移动(注意:如果你有一个 ~Node(),你不会自动获得移动功能)。

struct Node {
public:
    Node()
        : count(0)
        , kids(std::make_unique<WordTable>()) // std::unique_ptr(new WordTable())
    {}
    int incrementCount() {
        return ++count;
    }
private:
    int count;
    std::unique_ptr<WordTable> kids;
};

如果我们使用 Valgrind 进行编译和运行,我们会看到一些暗示问题的编译器警告:

g++ -std=c++2a -fPIC -g -Wall -Wextra -Wwrite-strings -Wno-parentheses -Wpedantic -Warray-bounds  -Weffc++       59351752.cpp    -o 59351752
59351752.cpp:23:10: warning: ‘struct WordTable::Node’ has pointer data members [-Weffc++]
   23 |   struct Node {
      |          ^~~~
59351752.cpp:23:10: warning:   but does not override ‘WordTable::Node(const WordTable::Node&)’ [-Weffc++]
59351752.cpp:23:10: warning:   or ‘operator=(const WordTable::Node&)’ [-Weffc++]

以及我们第一次释放后使用的指示:

valgrind --leak-check=full ./59351752   
==1137125== Memcheck, a memory error detector
==1137125== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1137125== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==1137125== Command: ./59351752
==1137125== 
add word 1
begin emplace
emplace succeeded
end add word 1
begin emplace
==1137125== Invalid read of size 8
==1137125==    at 0x10B3D8: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_begin() const (hashtable.h:384)
==1137125==    by 0x10AFD1: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::clear() (hashtable.h:2028)
==1137125==    by 0x10ACCD: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1352)
==1137125==    by 0x10A905: std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, WordTable::Node, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> > >::~unordered_map() (unordered_map.h:102)
==1137125==    by 0x10A94F: WordTable::~WordTable() (59351752.cpp:11)
==1137125==    by 0x10AAA3: WordTable::Node::~Node() (59351752.cpp:30)
==1137125==    by 0x10A9C5: WordTable::addWord(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) (59351752.cpp:15)
==1137125==    by 0x10A3B2: main (59351752.cpp:52)
==1137125==  Address 0x4d7d228 is 24 bytes inside a block of size 64 free'd
==1137125==    at 0x483708B: operator delete(void*, unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1137125==    by 0x10AAB0: WordTable::Node::~Node() (59351752.cpp:30)
==1137125==    by 0x10CBD9: std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>::~pair() (stl_pair.h:208)
==1137125==    by 0x10CC05: void __gnu_cxx::new_allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>*) (new_allocator.h:153)
==1137125==    by 0x10C58D: void std::allocator_traits<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >(std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>*) (alloc_traits.h:497)
==1137125==    by 0x10BC32: std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true>*) (hashtable_policy.h:2102)
==1137125==    by 0x10B548: std::pair<std::__detail::_Node_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, false, true>, bool> std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node>(std::integral_constant<bool, true>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node&&) (hashtable.h:1655)
==1137125==    by 0x10B0B2: std::pair<std::__detail::_Node_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, false, true>, bool> std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node&&) (hashtable.h:749)
==1137125==    by 0x10AD2D: std::pair<std::__detail::_Node_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, false, true>, bool> std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, WordTable::Node, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> > >::emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node&&) (unordered_map.h:388)
==1137125==    by 0x10A9B9: WordTable::addWord(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) (59351752.cpp:15)
==1137125==    by 0x10A3B2: main (59351752.cpp:52)
==1137125==  Block was alloc'd at
==1137125==    at 0x4835DEF: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1137125==    by 0x10AA66: WordTable::Node::Node() (59351752.cpp:27)
==1137125==    by 0x10A9A6: WordTable::addWord(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) (59351752.cpp:15)
==1137125==    by 0x10A3B2: main (59351752.cpp:52)
==1137125== 

这显然是由编译器生成的复制构造函数中复制的 Node 中的原始指针引起的。

如果我们简单地将 Node::kids 更改为 std::unique_ptr 并在离开 main() 之前删除 test,那么我们将得到一个干净的 Valgrind 运行。