Predictable exit code of crashed process in Windows
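For a process that exits normally in Windows, the exit code is generally either the return value from main or the exit code passed to the exit function. %ERRORLEVEL% can then be used to query the exit code, which tells you whether the program executed correctly or whether some exceptional inputs/failures indicate a particular (application-specific) problem.
However, I'm interested in the exit code when a process crashes. Take a very simple example program: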
int main()
{
    int* a = nullptr;
    *a = 0xBAD;   // deliberate write through a null pointer; the crash happens here
    return 0;     // never reached
}
When I compile and run this in Windows, I get the following on the command line:
MyCrashProgram.exe -> crashes
echo %ERRORLEVEL% -> -1073741819
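The exit code is consistently this number. This leads to several questions:
- Is the exit code -1073741819 somehow predictable, based on the invalid write crash?
- If so, is there some way to determine the type of crash from the exit code?
- Does this change with the compiler used (I used MSVC 2012)?
- Does this change with the version of Windows being used (I used Win10 TP)?
- Does this change with the architecture (e.g. x64 - I used Win32)?
Note, I am not interested in how to modify the program to catch the exception. I am interested in classifying crashes that can happen in existing programs, which I may not be able to modify.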
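This is by no means a comprehensive answer, but rather some hints to get you going.
I don't think there is a way to automatically distinguish every possible cause of a crash. To do that, you would have to catch the errors yourself and supply your own exit codes.
To catch all possible (catchable) errors, you have to set up both exception handlers and signal handlers. This is because an access violation is an exception on Windows and a signal (SIGSEGV) on Linux.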
For details on the different kinds of errors on Windows, see this other question: Catching access violation exceptions
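As a rough illustration of "catch the error yourself and supply your own exit code", here is a minimal sketch of the signal-handler half only. The exit code values and the function names (exit_code_for_signal, crash_handler, install_crash_handlers) are made up for this example, and it assumes you can modify the program. The Windows structured-exception half is not shown, and whether a given hardware fault actually reaches a signal handler depends on the platform and runtime.

```cpp
#include <csignal>
#include <cstdlib>

// Hypothetical exit codes for this sketch; any distinct nonzero values would do.
const int kExitOther = 100;
const int kExitSegv  = 101;
const int kExitAbort = 102;
const int kExitFpe   = 103;

// Pure mapping from a signal number to the exit code we want to report.
int exit_code_for_signal(int sig)
{
    switch (sig) {
        case SIGSEGV: return kExitSegv;
        case SIGABRT: return kExitAbort;
        case SIGFPE:  return kExitFpe;
        default:      return kExitOther;
    }
}

// Handler: leave immediately with the mapped code.
// std::_Exit skips atexit handlers and destructors, which is preferable
// once the process state can no longer be trusted.
extern "C" void crash_handler(int sig)
{
    std::_Exit(exit_code_for_signal(sig));
}

// Register the handler for the signals this sketch knows about.
void install_crash_handlers()
{
    std::signal(SIGSEGV, crash_handler);
    std::signal(SIGABRT, crash_handler);
    std::signal(SIGFPE,  crash_handler);
}
```

Calling install_crash_handlers() at the start of main would be the intended use. This does not help with the original goal of classifying crashes in programs that cannot be modified.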
A comment about STATUS_ACCESS_VIOLATION led me to the documentation for GetExceptionCode:
The return value identifies the type of exception. The following table identifies the exception codes that can occur due to common programming errors. These values are defined in WinBase.h and WinNT.h.
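EXCEPTION_ACCESS_VIOLATION maps to STATUS_ACCESS_VIOLATION in the list that follows. Every exception code in that list with the EXCEPTION prefix is defined directly as the corresponding code with the STATUS prefix. The documentation for RaiseException explains how the system tries to hand an exception to a debugger when one occurs; the final step is: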
If the process is not being debugged, or if the associated debugger does not handle the exception, the system provides default handling based on the exception type. For most exceptions, the default action is to call the ExitProcess function.
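So, to answer my questions:
- Yes, the exit code is predictable: it maps to EXCEPTION_ACCESS_VIOLATION, whose value 0xC0000005 reads as -1073741819 when printed as a signed 32-bit integer.
- Other kinds of errors map to other common exception codes. However, by calling RaiseException with an arbitrary exception code (and leaving it unhandled), the exit code of the process can be anything.
- The exit code depends on the Windows SDK, not on the compiler, the Windows version the program runs on, or the architecture. In theory it could change with a newer Windows SDK, but that is very unlikely because it would break backward compatibility.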
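To make the mapping concrete, here is a small sketch that reinterprets the signed number shown by %ERRORLEVEL% as an unsigned 32-bit status and looks it up in a short table. The function names are invented for this example, and the table is deliberately incomplete: it only lists a few status values as they are defined in the Windows SDK headers, written out as literals so the code does not need those headers.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Reinterpret the signed value printed by %ERRORLEVEL% as an unsigned 32-bit code.
std::uint32_t to_status(std::int32_t exit_code)
{
    return static_cast<std::uint32_t>(exit_code);
}

// Non-exhaustive lookup of a few well-known status values.
const char* status_name(std::uint32_t status)
{
    switch (status) {
        case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
        case 0xC000001Du: return "STATUS_ILLEGAL_INSTRUCTION";
        case 0xC0000094u: return "STATUS_INTEGER_DIVIDE_BY_ZERO";
        case 0xC00000FDu: return "STATUS_STACK_OVERFLOW";
        case 0xC0000135u: return "STATUS_DLL_NOT_FOUND";
        default:          return "unknown";
    }
}

// Format the code as hex followed by its name, if the table knows it.
std::string describe_exit_code(std::int32_t exit_code)
{
    const std::uint32_t status = to_status(exit_code);
    char buf[16];
    std::snprintf(buf, sizeof buf, "0x%08X", static_cast<unsigned>(status));
    return std::string(buf) + " (" + status_name(status) + ")";
}
```

For the crash in the question, describe_exit_code(-1073741819) yields "0xC0000005 (STATUS_ACCESS_VIOLATION)". A match only suggests the cause, since a program can pass the same value to ExitProcess on purpose.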
Here is a related short blog post by Raymond Chen (emphasis mine):
There is no standard for process exit codes. You can pass anything you
want to ExitProcess, and that's what GetExitCodeProcess will give
back. The kernel does no interpretation of the value. If you want
code 42 to mean "Something infinitely improbable has occurred" then
more power to you.
There is a convention, however, that an exit code of zero means
success (though what constitutes "success" is left to the discretion
of the author of the program) and a nonzero exit code means failure
(again, with details left to the discretion of the programmer). Often,
higher values for the exit code indicate more severe types of failure.
The command processor ERRORLEVEL keyword was designed with these
conventions in mind.
There are cases where your process will get in such a bad state that a
component will take it upon itself to terminate the process. For
example, if a process cannot locate the DLLs it imports from, or one
of those DLLs fails to initialize, the loader will terminate the
process and use the status code as the process exit code. I believe
that when a program crashes due to an unhandled exception, the
exception code is used as the exit code.
A customer was seeing their program crash with an exit code of 3 and
couldn't figure out where it was coming from. They never use that exit
code in their program. Eventually, the source of the magic number 3
was identified: The C runtime abort function terminates the process
with exit code 3.
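Putting these observations together, a first-pass classifier for exit codes of programs you cannot modify might look like the sketch below. It is a heuristic, not a rule. It assumes that a value whose top two bits are both set (the "error" severity in the NTSTATUS layout) probably came from an unhandled exception or a loader failure, and that 3 may have come from the C runtime abort function. The ExitKind names and function names are invented for this example.

```cpp
#include <cstdint>

// Coarse buckets for an exit code; the names are hypothetical.
enum class ExitKind { Success, PossiblyAbort, LikelyCrash, ApplicationDefined };

// In the NTSTATUS layout the top two bits hold the severity; both set means "error".
bool has_error_severity(std::uint32_t code)
{
    return (code >> 30) == 0x3u;
}

// Heuristic only: a program is free to return any of these values deliberately.
ExitKind classify_exit_code(std::uint32_t code)
{
    if (code == 0) return ExitKind::Success;             // conventional success
    if (code == 3) return ExitKind::PossiblyAbort;       // the C runtime abort exits with 3
    if (has_error_severity(code)) return ExitKind::LikelyCrash;
    return ExitKind::ApplicationDefined;                 // meaning is up to the program
}
```

With this sketch, 0xC0000005 lands in LikelyCrash and 42 in ApplicationDefined. Statuses with a lower severity are not treated as crashes here, which is a simplification.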