How to recover from a segmentation fault in C++?

Question

I have some production-critical code that has to keep running.

Think of the code as

while (true){
   init();
   do_important_things();  //segfault here
   clean();
}

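I can't trust the code to be bug-free, and I need to be able to log problems so they can be investigated later.

This time, I know for a fact that a segmentation fault is raised somewhere in the code. I need to be able to at least log that error and then start over.

Reading here, there are a few solutions, but each one is followed by a flame war claiming that the solution will actually do more harm than good, with no real explanation. I also found this answer, which I am considering using, but I'm not sure it fits my use case.

So, what is the best way to recover from a segmentation fault in C++?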

Answer 1

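I suggest creating a very small program, one simple enough that you can make it really safe, to monitor the buggy program. If the buggy program exits in a way you don't like, restart it.

POSIX example: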

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <cstdio>
#include <cstdlib>
#include <iostream>

int main(int argc, char* argv[]) {
    if(argc < 2) {
        std::cerr << "USAGE: " << argv[0] << " program_to_monitor <arguments...>\n";
        return 1;
    }

    while(true) {
        pid_t child = fork();          // create a child process

        if(child == -1) {
            std::perror("fork");
            return 1;
        }

        if(child == 0) {
            execvp(argv[1], argv + 1); // start the buggy program
            std::perror(argv[1]);      // only reached if execvp failed
            std::exit(0);              // exit with 0 to not trigger a retry
        }

        // Wait for the buggy program to terminate and check the status
        // to see if it should be restarted.

        if(int wstatus; waitpid(child, &wstatus, 0) != -1) {
            if(WIFEXITED(wstatus)) {
                if(WEXITSTATUS(wstatus) == 0) return 0; // normal exit, terminate

                std::cerr << argv[0] << ": " << argv[1] << " exited with "
                          << WEXITSTATUS(wstatus) << '\n';
            }
            if(WIFSIGNALED(wstatus)) {
                std::cerr << argv[0] << ": " << argv[1]
                          << " terminated by signal " << WTERMSIG(wstatus);
                if(WCOREDUMP(wstatus)) std::cerr << " (core dumped)";
                std::cerr << '\n';
            }
            std::cerr << argv[0] << ": Restarting " << argv[1] << '\n';
        } else {
            std::perror("wait");
            break;
        }
    }
}
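
To make the status check above more concrete, here is a minimal sketch of what the monitor sees when the child actually dies from a segmentation fault. The helper names `describe_status` and `crash_child_status` are hypothetical and not part of the monitor above, and the crash is simulated with `std::raise(SIGSEGV)` rather than a real invalid memory access.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <csignal>
#include <string>

// Hypothetical helper: turn a waitpid() status into a log-friendly string.
std::string describe_status(int wstatus) {
    if (WIFEXITED(wstatus))
        return "exited with " + std::to_string(WEXITSTATUS(wstatus));
    if (WIFSIGNALED(wstatus))
        return "terminated by signal " + std::to_string(WTERMSIG(wstatus));
    return "unknown status";
}

// Hypothetical helper: fork a child that dies from SIGSEGV and store its
// wait status in wstatus. Returns false if fork() or waitpid() failed.
bool crash_child_status(int& wstatus) {
    pid_t child = fork();
    if (child == -1) return false;

    if (child == 0) {
        std::signal(SIGSEGV, SIG_DFL); // make sure the default action applies
        std::raise(SIGSEGV);           // simulate the segfault
        _exit(0);                      // not reached if the signal killed us
    }

    return waitpid(child, &wstatus, 0) != -1;
}
```

With a status obtained this way, `WIFSIGNALED(wstatus)` should be true and `WTERMSIG(wstatus)` should equal `SIGSEGV`, so the monitor can log a segfault specifically before restarting the program.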

c++

error-handling

signals

exception

segmentation-fault