How do I get the full stack trace from an exception object when debugging in WinDbg?

I have a dump file taken from an application pool crash in IIS. One thread contains a System.NullReferenceException. I was able to find the NullReferenceException in the dump, but I cannot view the _stackTrace:

0:070> !wdo 00000009cea474b8

Address: 00000009cea474b8
Method Table/Token: 00007ffb53d2dc88/200011704 
Class Name: System.NullReferenceException
Size : 160
EEClass: 00007ffb53ea3e88
Instance Fields: 19
Static Fields: 0
Total Fields: 19
Heap/Generation: 10/0
Module: 00007ffb53d10000
Assembly: 0000000c5a934170
Domain: 00007ffb56324100
Assembly Name: C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Inherits: System.SystemException System.Exception System.Object (00007FFB53D30660 00007FFB53DB5D20 00007FFB53DB5F88)
00007ffb53db5b70                                    System.String +0000                               _className 0000000751fec958 System.NullReferenceException
00007ffb53dbd6b8                     System.Reflection.MethodBase +0008                         _exceptionMethod 000000074ffc4608
00007ffb53db5b70                                    System.String +0010                   _exceptionMethodString 0000000000000000 (null)
00007ffb53db5b70                                    System.String +0018                                 _message 000000084c638270 オブジェクト参照がオブジェクト インスタンスに設定されていません。
00007ffb53d2de78                   System.Collections.IDictionary +0020                                    _data 0000000751fd4340
00007ffb53db5d20                                 System.Exception +0028                          _innerException 0000000000000000
00007ffb53db5b70                                    System.String +0030                                 _helpURL 0000000000000000 (null)
00007ffb53db5f88                                    System.Object +0038                              _stackTrace 000000074ff7ad00 0a 00 00 00 00 00 00 00 a0 3e 09 6c 0c 00 00 00 c5 44 70 f8 fa 7f 00 00 40 04 84 5b 0c 00 00 00  .........>.l.....Dp....@..[.... (...more...)
00007ffb53db5f88                                    System.Object +0040                           _watsonBuckets 000000074ff7aeb0 01 00 00 00 43 00 4c 00 52 00 32 00 30 00 72 00 33 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ....C.L.R.2.0.r.3............... (...more...)
00007ffb53db5b70                                    System.String +0048                        _stackTraceString 0000000000000000 (null)
00007ffb53db5b70                                    System.String +0050                  _remoteStackTraceString 0000000000000000 (null)
00007ffb53db5f88                                    System.Object +0058                          _dynamicMethods 0000000000000000
00007ffb53db5b70                                    System.String +0060                                  _source 0000000752028c20 System.Web
00007ffb53d495c8         System.Runtime.Serialization.SafeSeriali +0068                _safeSerializationManager 00000009cea47610
00007ffb53dbde60                                    System.IntPtr +0070                                   _xptrs 0 (0n0)
00007ffb53d99290                                   System.UIntPtr +0078                      _ipForWatsonBuckets 0 (0n0)
00007ffb53db80f8                                     System.Int32 +0080                        _remoteStackIndex 0 (0n0)
00007ffb53db80f8                                     System.Int32 +0084                                 _HResult 80004003 (0n-2147467261)
00007ffb53db80f8                                     System.Int32 +0088                                   _xcode e0434352 (0n-532462766)

How can I get the full stack as a string from _stackTrace? Also, what steps can I take to find out the origin of the NullReferenceException?

TL;DR

You can use !pe to display the stack trace of the thrown exception. In probably more than 90% of cases, that is all you need.

Simple and complete example

Here is a minimal reproducible example:

using System;

namespace ConsoleNetFramework
{
    class Program
    {
        static void Main()
        {
            throw new ApplicationException("An exception");
        }
    }
}

And the debugging session:

0:000> .loadby sos clr
0:000> !pe
Exception object: 0335246c
Exception type:   System.ApplicationException
Message:          An exception
InnerException:   <none>
StackTrace (generated):
    SP       IP       Function
    012FEFB0 01720897 ConsoleNetFramework!ConsoleNetFramework.Program.Main()+0x4f

StackTraceString: <none>
HResult: 80131600

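Complex example

Because I thought the answer was too simple, I figured I could make it more interesting by adding a few rarer situations to the example, like this: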

static void Main()
{
    var b = new BadImageFormatException("Bad exception");
    ApplicationException a;
    try
    {
        throw new ApplicationException("An exception");
    }
    catch (ApplicationException aex)
    {
        a = aex;
    }

    var t = new Thread(ThrowAnotherException);
    t.Start();
    throw new DataException("Data exception");

    Console.WriteLine(b.Message);
    Console.WriteLine(a.Message);
}

private static void ThrowAnotherException()
{
    throw new ConfigurationException("Config exception");
}

In the end we will have 4 exception objects in memory:

  • A...Exception
  • B...Exception
  • C...Exception
  • D...Exception

We can look at all of them.

The debugging session is:

0:000> .loadby sos clr
0:000> !pe
Exception object: 02eb2ca8
Exception type:   System.Data.DataException
Message:          Data exception
[...]

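So far so good: we stopped at the first exception that was thrown, and we can view it with !pe.

If we switch to another thread, we can see that the exception is marked with # in the thread list, as long as that marker is not overwritten by . (the current thread).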

0:000> ~1s
0:001> ~
#  0  Id: 28a8.501c Suspend: 1 Teb: 00cc6000 Unfrozen
.  1  Id: 28a8.1044 Suspend: 1 Teb: 00cc9000 Unfrozen
[...]

Here comes the interesting part: let's freeze thread 0 and run the application so that the second exception is thrown.

0:001> ~0f
0:001> g
System 0: 1 of 7 threads are frozen
WARNING: Continuing a non-continuable exception
System 0: 1 of 8 threads were frozen
System 0: 1 of 8 threads are frozen
(28a8.4228): CLR exception - code e0434352 (!!! second chance !!!)
System 0: 1 of 8 threads were frozen
[...]

Now we have another exception that caused the debugger to stop.

0:006> ~1s
[...]
0:001> ~
   0  Id: 28a8.501c Suspend: 1 Teb: 00cc6000 Frozen  
.  1  Id: 28a8.1044 Suspend: 1 Teb: 00cc9000 Unfrozen
   2  Id: 28a8.1a2c Suspend: 1 Teb: 00ccc000 Unfrozen
   3  Id: 28a8.5cfc Suspend: 1 Teb: 00ccf000 Unfrozen
   4  Id: 28a8.490 Suspend: 1 Teb: 00cd2000 Unfrozen
   5  Id: 28a8.2b60 Suspend: 1 Teb: 00cd5000 Unfrozen
#  6  Id: 28a8.4228 Suspend: 1 Teb: 00cd8000 Unfrozen
[...]

That exception occurred on thread 6. We can also see that there is only one # marker in that list, not two.

On the correct thread, we can print the exception:

0:006> !pe
Exception object: 02eb3ff4
Exception type:   System.Configuration.ConfigurationException
Message:          Config exception
[...]

On the wrong thread, we can't:

0:006> ~1s
[...]
0:001> !pe
The current thread is unmanaged

Lesson learned: the !pe command is thread-sensitive.

Now, what about the other two exceptions, A... and B...? We can find them on the heap:

0:000> !dumpheap -type BadImageFormatException
 Address       MT     Size
02eb2a6c 6e3983c4       92   
[...]
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
[...]
6e342734  40002ab       20        System.Object  0 instance 00000000 _stackTrace
[...]

An exception that was never thrown has no stack trace. It is null.

0:000> !dumpheap -type ApplicationException
 Address       MT     Size
02eb2ae4 6e388090       84     

Statistics:
      MT    Count    TotalSize Class Name
6e388090        1           84 System.ApplicationException
Total 1 objects
0:000> !DumpObj /d 02eb2ae4
Name:        System.ApplicationException
[...]
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
[...]
6e342734  40002ab       20        System.Object  0 instance 02eb2b7c _stackTrace
[...]

An exception that was thrown but caught still has its stack trace.

0:000> !pe 02eb2ae4
Exception object: 02eb2ae4
Exception type:   System.ApplicationException
Message:          An exception
InnerException:   <none>
StackTrace (generated):
    SP       IP       Function
    00EFEEA0 052F08D2 ConsoleNetFramework!ConsoleNetFramework.Program.Main()+0x8a

StackTraceString: <none>
HResult: 80131600

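Lesson learned: !pe can take an arbitrary exception as an argument and print the exception with its call stack.

Looking at the output of the managed !threads command, both exceptions are listed (unlike ~, which shows only one #):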
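
The difference between the never-thrown and the thrown-but-caught exception can also be observed from managed code, without a debugger. The following is only a minimal sketch: the class name StackTraceDemo and its two helper methods are made up for illustration and are not part of the example program above. It relies on the documented behavior that Exception.StackTrace is null until the exception has been thrown.

```csharp
using System;

// Hypothetical demo class; the names here are illustrative only.
static class StackTraceDemo
{
    // Constructs an exception object but never throws it.
    public static Exception CreateOnly()
    {
        return new BadImageFormatException("Bad exception");
    }

    // Throws an exception, catches it, and hands the object back.
    public static Exception ThrowAndCatch()
    {
        try
        {
            throw new ApplicationException("An exception");
        }
        catch (ApplicationException aex)
        {
            return aex;
        }
    }

    static void Main()
    {
        // Never thrown: the runtime has not recorded any stack trace.
        Console.WriteLine(CreateOnly().StackTrace == null);

        // Thrown and caught: the stack trace survives on the object.
        Console.WriteLine(ThrowAndCatch().StackTrace != null);
    }
}
```

Both lines should print True, which matches what !DumpObj shows for the _stackTrace field of the two objects on the heap.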

0:001> !threads
ThreadCount:      3
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 501c 00fb99c0     2a020 Preemptive  02EC0714:00000000 00f815a0 0     MTA System.Data.DataException 02eb2ca8
   5    2 2b60 00fc4728     2b220 Preemptive  00000000:00000000 00f815a0 0     MTA (Finalizer) 
   6    3 4228 00fec9a0     2b020 Preemptive  02EC5B70:00000000 00f815a0 0     MTA System.Configuration.ConfigurationException 02eb3ff4

what steps can I take to find out the origin of NullReferenceException?

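With the call stack and symbols, you should be able to identify a single line of code. The next step is a code review there. Find out which variable is null. Write a unit test to reproduce the problem. Then fix the bug, just like in red/green refactoring.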
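
As a rough illustration of that red/green loop, here is a small sketch. Everything in it is hypothetical: NullOriginDemo, NameLength and ReproducesNullReference are placeholder names, and the string variable merely stands in for whatever turns out to be null on the line you identified from the call stack.

```csharp
using System;

// Hypothetical stand-in for the code found via the call stack.
static class NullOriginDemo
{
    // "Red" step: reproduce the failure by dereferencing a null reference,
    // as the suspicious line would do.
    public static bool ReproducesNullReference()
    {
        string name = null;
        try
        {
            return name.Length >= 0; // throws NullReferenceException
        }
        catch (NullReferenceException)
        {
            return true; // the bug is reproduced
        }
    }

    // "Green" step: one possible fix is a guard that fails early and
    // names the offending variable instead of a bare NullReferenceException.
    public static int NameLength(string name)
    {
        if (name == null) throw new ArgumentNullException(nameof(name));
        return name.Length;
    }
}
```

Whether a guard clause, a default value, or a fix further up the call chain is appropriate depends on why the variable was null in the first place; the sketch only shows the shape of the reproduce-then-fix cycle.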