Why is this usage of setjmp/longjmp undefined behavior?

Question

Code

#include <csetjmp>
#include <exception> // std::exception is thrown by the handlers below

template <typename Callable>
void create_checkpoint(std::jmp_buf buf, Callable&& callable)
{
    if (setjmp(buf) != 0)
    {
        callable();
    }
}

#include <iostream>

struct announcer {
    int id;
    announcer(int id):
        id{id}
    {
        std::cout << "created announcer with id " << id << '\n';
    }
    ~announcer() {
        std::cout << "destructing announcer with id " << id << '\n'; 
    }
};

void oopsie(std::jmp_buf buf, bool shouldJump)
{
    if (shouldJump)
    {
        // std::cout << "performing jump...\n";
        std::longjmp(buf, 1);
    }
}

void test1() 
{
    std::jmp_buf buf;
    announcer a1{1};
    create_checkpoint(buf, []() {throw std::exception();});
    oopsie(buf, true);
}

void test2()
{
    std::jmp_buf buf;
    announcer a1{1};
    create_checkpoint(buf, []() {throw std::exception();});
    oopsie(buf, false);


    announcer a2{2};
    create_checkpoint(buf, []() {throw std::exception();});
    oopsie(buf, true);
}

int main()
{
    try 
    {
        test1();
    }
    catch (...)
    {}

    try 
    {
        test2();
    }
    catch (...)
    {}
}

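Context

I have to call a C library that reports errors through longjmp. To provide the strong exception guarantee, I want to create a function that works much like std::lock_guard: I just write create_checkpoint(buf, handler) and keep calling the C library functions until I allocate more resources (if I understand correctly, destructors of objects created below the setjmp line are not called).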

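Question

Why is undefined behavior invoked in this case, and how can I fix it?

How did I find out that this is undefined behavior?

Printing a message to std::cout before std::longjmp versus not printing one produces very different results, even though that line has little to do with control flow.

What do I understand so far?

I understand that std::longjmp essentially restores the registers and jumps to the instruction pointer saved by the setjmp macro. Also, the functions are not optimized away: the compiled code does contain an instruction that calls longjmp.

Converting create_checkpoint into a macro seems to solve the issue. But I wonder whether there is a better way to do this?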

Answer 1

From https://en.cppreference.com/w/cpp/utility/program/longjmp

If the function that called setjmp has exited, the behavior is undefined (in other words, only long jumps up the call stack are allowed)

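Since you did not obey this rule (create_checkpoint, the function that called setjmp, has already returned by the time longjmp runs), your program has undefined behavior.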
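One way to stay within that rule is to call setjmp directly in the function whose stack frame is still active when longjmp runs, and to throw only after setjmp has returned the second time. The following is a minimal sketch, not a definitive fix: c_library_call and checked_call are hypothetical names, and c_library_call merely stands in for a real C function that reports errors through longjmp. It also assumes that no objects with non-trivial destructors live in the frames that longjmp skips, since that case is undefined behavior as well.

```cpp
#include <csetjmp>
#include <stdexcept>

// Hypothetical stand-in for a C library function that reports errors via longjmp.
void c_library_call(std::jmp_buf buf, bool should_fail)
{
    if (should_fail)
    {
        std::longjmp(buf, 1); // jumps back into checked_call, whose frame is still alive
    }
}

// setjmp is invoked in the same function that later calls into the library,
// so the long jump only ever goes up the call stack.
void checked_call(bool should_fail)
{
    std::jmp_buf buf;
    if (setjmp(buf) != 0)
    {
        // Reached through longjmp. From here on, ordinary C++ exception
        // handling takes over and unwinds the callers normally.
        throw std::runtime_error("C library reported an error");
    }
    c_library_call(buf, should_fail);
}
```

Calling checked_call(true) inside a try block should end in the std::runtime_error handler, while checked_call(false) returns normally. Each wrapper function needs its own setjmp call, which is presumably why the macro version in the question appears to work: the macro expands setjmp into the caller's frame instead of a helper's.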

Answer 2

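The code that populates a jmp_buf has to know what should be left on the stack after the buffer is passed to longjmp. If setjmp were processed as a compiler intrinsic that could only be used within functions that return int, a compiler could arrange things so that a longjmp would cause the function that called setjmp to "return twice", rather than having setjmp itself be regarded as doing so. On many implementations, however, a call to setjmp is processed like a call to any other function, and the compiler knows nothing about it beyond its prototype. On such implementations, setjmp has no way to arrange for longjmp to return to the caller of the calling function without information about that function's stack frame. A compiler processing the setjmp call would have the required information, but it would have no reason to make it available to setjmp, and setjmp has no way of getting that information without such compiler support.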

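Incidentally, what is even more annoying about setjmp is that, while it can return a value, setjmp invocations must appear in a very narrow set of contexts, none of which can conveniently capture the returned value. In theory one could write: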

int setJmpValue;
switch(setjmp(...))
{
  case 0: setJmpValue=0; break;
  case 1: setJmpValue=1; break;
  ...
  case INT_MAX-1: setJmpValue = INT_MAX-1; break;
  case INT_MAX  : setJmpValue = INT_MAX  ; break;
}

but that would be annoying, and it could not be factored out into a function.

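I don't think there should have been any difficulty in allowing i = setjmp(...); where i is an int of static or automatic duration, which would in turn make arbitrary use of the returned value possible. But the Standard provides no such construct, and it is no longer fashionable for compilers to process useful constructs predictably unless the Standard requires them to.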
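When the set of values passed to longjmp is small and known in advance, the switch form stays manageable. Here is a rough sketch under that assumption. The error codes, fail_with, and run_and_classify are all made up for illustration and do not come from any real library. setjmp appears only as the controlling expression of the switch, which is one of the permitted contexts.

```cpp
#include <csetjmp>

// Hypothetical error codes handed to longjmp; not taken from any real library.
constexpr int kParseError = 1;
constexpr int kIoError = 2;

// Hypothetical failing routine: jumps back with the given nonzero code.
void fail_with(std::jmp_buf buf, int code)
{
    if (code != 0)
    {
        std::longjmp(buf, code);
    }
}

// Returns 0 on success, the error code for known errors, -1 for unknown ones.
int run_and_classify(int code_to_raise)
{
    std::jmp_buf buf;
    // volatile so the value is well defined after a longjmp, should this
    // code later be changed to modify it between setjmp and longjmp.
    volatile int result = -1;
    switch (setjmp(buf)) // setjmp as the entire controlling expression
    {
    case 0: // direct return from setjmp: run the protected code
        fail_with(buf, code_to_raise);
        result = 0;
        break;
    case kParseError:
        result = kParseError;
        break;
    case kIoError:
        result = kIoError;
        break;
    default: // any code this sketch does not know about
        result = -1;
        break;
    }
    return result;
}
```

As with the theoretical version, the switch must live in the function whose frame remains active during the longjmp, so it still cannot be moved into a helper function.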
