Predictable exit code of crashed process in Windows

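For a process that exits normally in Windows, the exit code is generally either the return value from main or the exit code the program passes explicitly when it exits. %ERRORLEVEL% can then be used to query the exit code, which tells you whether the program ran correctly or whether some exceptional inputs/failures point to a particular (application-specific) problem.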

However, I'm interested in the exit code when a process crashes. Take a very simple example program:

int main()
{
    int * a = nullptr;
    *a = 0xBAD;
    return 0;
}

When I compile and run this on Windows, I get the following at the command line:

MyCrashProgram.exe -> crashes
echo %ERRORLEVEL%  -> -1073741819

The exit code is consistently this number. This raises a few questions:

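Note that I'm not interested in how to modify the program to catch the exception. I'm interested in classifying crashes that can happen in existing programs, which I may not be able to modify.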

This is by no means a comprehensive answer, just some hints so that you can move forward.

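I don't think there is a way to automatically distinguish every possible cause of a crash. To do that, you have to catch the errors yourself and supply your own exit codes.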

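To catch all possible (catchable) errors, you have to set up both exception handlers and signal handlers. This is because an access violation is an exception under Windows and a signal (SIGSEGV) under Linux.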
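To make the signal half of that concrete, here is a minimal sketch using only the standard C++ signal API. The names on_segfault, install_segfault_handler and kSegfaultExitCode are made up for this example, and the exit code 70 is an arbitrary choice. Whether an access violation actually reaches a handler installed this way depends on the platform and runtime, so treat it as an illustration rather than a complete solution.

```cpp
#include <csignal>
#include <cstdlib>

// Hypothetical exit code chosen for this sketch; any value the caller
// can recognize would do.
constexpr int kSegfaultExitCode = 70;

// Runs when SIGSEGV is delivered. Only async-signal-safe calls are
// allowed in a handler, so it terminates immediately via std::_Exit
// and reports our own exit code instead of the system's default one.
extern "C" void on_segfault(int /*signum*/) {
    std::_Exit(kSegfaultExitCode);
}

// Installs the handler; returns false if the runtime refused it.
bool install_segfault_handler() {
    return std::signal(SIGSEGV, on_segfault) != SIG_ERR;
}
```

Calling install_segfault_handler() at the top of main is enough to try it out. The exception half on Windows is covered by the question linked below.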

For details on the different kinds of errors on Windows, see this question: Catching access violation exceptions

And here is another topic on signal handling on Linux.

The comment about STATUS_ACCESS_VIOLATION led me to the documentation for GetExceptionCode:

The return value identifies the type of exception. The following table identifies the exception codes that can occur due to common programming errors. These values are defined in WinBase.h and WinNT.h.

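EXCEPTION_ACCESS_VIOLATION maps to STATUS_ACCESS_VIOLATION in the list that follows. Every exception code in that list with the EXCEPTION prefix is defined directly as the corresponding code with the STATUS prefix. The documentation for RaiseException explains how the system searches for a handler and tries to involve a debugger when an exception occurs; the final step is: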

If the process is not being debugged, or if the associated debugger does not handle the exception, the system provides default handling based on the exception type. For most exceptions, the default action is to call the ExitProcess function.

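So, to answer my questions:

  • Yes, the exit code is predictable: it maps to EXCEPTION_ACCESS_VIOLATION (STATUS_ACCESS_VIOLATION, 0xC0000005, which prints as -1073741819 when read as a signed 32-bit integer).
  • Other types of errors map to other common exception codes. However, by calling RaiseException with an arbitrary exception code (and leaving it unhandled), the exit code of the process can be anything.
  • The exit code depends on the Windows SDK, not on the compiler, the Windows version the program runs on, or the architecture. Although it could in theory change with a newer Windows SDK, that is highly unlikely for backward-compatibility reasons.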
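For classifying crashes from the outside, the mapping above can be sketched as a small lookup. The helper names to_status and describe_crash are invented for this example, and the table is deliberately incomplete; the hex values are the ones winnt.h assigns to the corresponding STATUS_ constants. Because RaiseException can be called with arbitrary codes, a match is a strong hint rather than proof.

```cpp
#include <cstdint>
#include <string>

// Reinterpret the signed value that %ERRORLEVEL% prints as the unsigned
// 32-bit status code it actually represents.
constexpr std::uint32_t to_status(std::int32_t errorlevel) {
    return static_cast<std::uint32_t>(errorlevel);
}

// The value from the question is STATUS_ACCESS_VIOLATION.
static_assert(to_status(-1073741819) == 0xC0000005u,
              "signed and unsigned views of the same 32 bits");

// Hypothetical helper: names a few well-known exception codes.
std::string describe_crash(std::int32_t errorlevel) {
    switch (to_status(errorlevel)) {
        case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
        case 0xC000001Du: return "STATUS_ILLEGAL_INSTRUCTION";
        case 0xC0000094u: return "STATUS_INTEGER_DIVIDE_BY_ZERO";
        case 0xC00000FDu: return "STATUS_STACK_OVERFLOW";
        default:          return "not a known exception code";
    }
}
```

With this, describe_crash(-1073741819) yields "STATUS_ACCESS_VIOLATION", which matches the crash in the question.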

Here is a related short blog post by Raymond Chen (emphasis mine):

There is no standard for process exit codes. You can pass anything you want to ExitProcess, and that's what GetExitCodeProcess will give back. The kernel does no interpretation of the value. If you want code 42 to mean "Something infinitely improbable has occurred" then more power to you.

There is a convention, however, that an exit code of zero means success (though what constitutes "success" is left to the discretion of the author of the program) and a nonzero exit code means failure (again, with details left to the discretion of the programmer). Often, higher values for the exit code indicate more severe types of failure. The command processor ERRORLEVEL keyword was designed with these conventions in mind.

There are cases where your process will get in such a bad state that a component will take it upon itself to terminate the process. For example, if a process cannot locate the DLLs it imports from, or one of those DLLs fails to initialize, the loader will terminate the process and use the status code as the process exit code. I believe that when a program crashes due to an unhandled exception, the exception code is used as the exit code.

A customer was seeing their program crash with an exit code of 3 and couldn't figure out where it was coming from. They never use that exit code in their program. Eventually, the source of the magic number 3 was identified: The C runtime abort function terminates the process with exit code 3.
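The non-exception cases mentioned in the post can be folded into the same kind of heuristic. The sketch below is only an assumption about how one might bucket them: ExitKind and classify_exit are hypothetical names, the two hex values are STATUS_DLL_NOT_FOUND and STATUS_DLL_INIT_FAILED, and an exit code of 3 is only a hint, since any program is free to return 3 on its own.

```cpp
#include <cstdint>

// Hypothetical buckets for exit codes that do not come from an
// unhandled exception.
enum class ExitKind { Success, AbortLikely, DllNotFound, DllInitFailed, OtherFailure };

ExitKind classify_exit(std::int32_t code) {
    switch (static_cast<std::uint32_t>(code)) {
        case 0u:          return ExitKind::Success;        // conventional success
        case 3u:          return ExitKind::AbortLikely;    // C runtime abort, or a deliberate 3
        case 0xC0000135u: return ExitKind::DllNotFound;    // STATUS_DLL_NOT_FOUND
        case 0xC0000142u: return ExitKind::DllInitFailed;  // STATUS_DLL_INIT_FAILED
        default:          return ExitKind::OtherFailure;   // application-specific or unknown
    }
}
```

As the post says, the kernel does not interpret these values, so the result is a guess about the cause and not a diagnosis.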