Different IL generated when adding one more int variable

Question

I have this program in C#:

using System;

class Program
{
    public static void Main()
    {
        int i = 4;
        double d = 12.34;
        double PI = Math.PI;
        string name = "Ehsan";


    }
}

When I compile it, this is the IL the compiler generates for Main:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       30 (0x1e)
  .maxstack  1
  .locals init (int32 V_0,
           float64 V_1,
           float64 V_2,
           string V_3)
  IL_0000:  nop
  IL_0001:  ldc.i4.4
  IL_0002:  stloc.0
  IL_0003:  ldc.r8     12.34
  IL_000c:  stloc.1
  IL_000d:  ldc.r8     3.1415926535897931
  IL_0016:  stloc.2
  IL_0017:  ldstr      "Ehsan"
  IL_001c:  stloc.3
  IL_001d:  ret
} // end of method Program::Main

Fine, I understand that. Now if I add one more integer variable, something different is generated. Here is the modified C# code:

using System;

class Program
{
    public static void Main()
    {
        int unassigned;
        int i = 4;
        unassigned = i;
        double d = 12.34;
        double PI = Math.PI;
        string name = "Ehsan";


    }
}

Here is the IL generated for the C# code above:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       33 (0x21)
  .maxstack  1
  .locals init (int32 V_0,
           int32 V_1,
           float64 V_2,
           float64 V_3,
           string V_4)
  IL_0000:  nop
  IL_0001:  ldc.i4.4
  IL_0002:  stloc.1
  IL_0003:  ldloc.1
  IL_0004:  stloc.0
  IL_0005:  ldc.r8     12.34
  IL_000e:  stloc.2
  IL_000f:  ldc.r8     3.1415926535897931
  IL_0018:  stloc.3
  IL_0019:  ldstr      "Ehsan"
  IL_001e:  stloc.s    V_4  // what is happening here in this case
  IL_0020:  ret
} // end of method Program::Main

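Notice that a stloc.s instruction is now generated with V_4, which is a local, but I am not clear on why. I also don't understand what the purpose of these locals is. I mean these: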

 .locals init (int32 V_0,
               float64 V_1,
               float64 V_2,
               string V_3)

Answer 1

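A few things to note.

First, this is presumably a debug build, or at least one compiled with some optimizations turned off. What I would expect to see here is: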

.method public hidebysig static void Main () cil managed 
{
  .entrypoint

  IL_0000: ret
}

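That is, since those locals are never used, I would expect the compiler to skip them entirely. It won't do that in a debug build, but this is a good example of how there can be a considerable difference between what the C# says and what the IL says.

The next thing to notice is how an IL method is structured. You have an array of local values of various types, defined by the .locals block. These generally correspond quite closely to what the C# had, though there will often be shortcuts and rearrangements.

Finally we have the set of instructions, which all act upon those locals, any arguments, and a stack that they can push to, pop from, and through which the various instructions interact.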

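The next thing to note is that the IL you see here is a sort of assembly language for bytecode: each instruction has a one-to-one mapping to one or two bytes, and each operand value also consumes a fixed number of bytes. So, for example, stloc V_4 (not actually present in your examples, but we'll come to that) would map to 0xFE 0x0E 0x04 0x00, where 0xFE 0x0E is the encoding of stloc and 0x04 0x00 is that of 4, the index of the local in question. It means "pop the value off the top of the stack, and store it in the 5th (index 4) local".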

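Now, there are a few abbreviations here. One of them is the .s "short" form of several instructions (_S in the name of the equivalent System.Reflection.Emit.OpCode value). These are variants of other instructions that take a one-byte value (signed or unsigned depending on the instruction) where the other form takes a two- or four-byte value, generally an index or a relative distance to jump. So instead of stloc V_4 we can have stloc.s V_4, which is only 0x13 0x04, and so is smaller.

Then there are some variants that include a particular value within the instruction itself. So instead of stloc V_0 or stloc.s V_0 we can just use stloc.0, which is the single byte 0x0A.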

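This makes a lot of sense when you consider that it's common to have only a handful of locals in use at a time, so using stloc.s or (better yet) the likes of stloc.0, stloc.1, etc. gives tiny savings that add up to quite a lot.

But only so much. If we had e.g. stloc.252, stloc.253 and so on, there would be a lot of such instructions, the number of bytes needed for each instruction would have to be larger, and overall it would be a loss. The super-short forms of the local-related (stloc, ldloc) and argument-related (ldarg) instructions only go up to 3. (There's starg and starg.s but no starg.0 etc., as storing to arguments is relatively rare.) ldc.i4/ldc.i4.s (push a constant 32-bit signed value onto the stack) has super-short versions going from ldc.i4.0 to ldc.i4.8, and also ldc.i4.m1 for -1.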
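To make the size trade-off concrete, here is a minimal sketch of how an assembler could pick the smallest stloc form for a given local index. The class and method names (StlocEncoder, Encode) are made up for illustration; only the opcode bytes come from the encodings quoted above, and the two-byte operand is written little-endian as in the stloc V_4 example.

```csharp
using System;

static class StlocEncoder
{
    // Hypothetical helper: returns the shortest byte sequence that encodes
    // "pop the stack and store into local number <index>".
    public static byte[] Encode(int index)
    {
        if (index < 0 || index > ushort.MaxValue)
            throw new ArgumentOutOfRangeException(nameof(index));

        if (index <= 3)
            return new[] { (byte)(0x0A + index) };   // stloc.0 .. stloc.3: one byte, index is implicit

        if (index <= byte.MaxValue)
            return new byte[] { 0x13, (byte)index }; // stloc.s: opcode plus a one-byte index

        // stloc: two-byte opcode plus a two-byte index, low byte first
        return new byte[] { 0xFE, 0x0E, (byte)(index & 0xFF), (byte)(index >> 8) };
    }

    static void Main()
    {
        foreach (int index in new[] { 0, 3, 4, 255, 256 })
            Console.WriteLine($"local {index}: {BitConverter.ToString(Encode(index))}");
    }
}
```

Running it shows the jump from one byte to two at index 4, which is exactly the stloc.s V_4 in the question, and from two bytes to four at index 256.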

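It's also worth noting that V_4 doesn't exist in your code at all. Whatever you examined the IL with didn't know you'd used the variable name name, so it just used V_4. (What are you using, by the way? I use ILSpy for the most part, and if you have debugging information associated with the file, it would have called it name accordingly.)

So, to produce a commented, non-shortened version of your method with more comparable names, we could write the following CIL: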

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  .maxstack  1
  .locals init (int32 unassigned,
           int32 i,
           float64 d,
           float64 PI,
           string name)
  nop                           // Do Nothing (helps debugger to have some of these around).
  ldc.i4   4                    // Push number 4 on stack
  stloc    i                    // Pop value from stack, put in i (i = 4)
  ldloc    i                    // Push value in i on stack
  stloc    unassigned           // Pop value from stack, put in unassigned (unassigned = i)
  ldc.r8   12.34                // Push the 64-bit floating value 12.34 onto the stack
  stloc    d                    // Pop the value from stack, put in d (d = 12.34)
  ldc.r8   3.1415926535897931   // Push the 64-bit floating value 3.1415926535897931 onto the stack.
  stloc    PI                   // Pop the value from stack, put in PI (PI = 3.1415… which is the constant Math.PI)
  ldstr    "Ehsan"              // Push the string "Ehsan" on stack
  stloc    name                 // Pop the value from stack, put in name
  ret                           // return.
}

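This will behave pretty much the same as your code, but it is a bit larger. So we replace stloc with stloc.0…stloc.3 where we can, with stloc.s where we can't use those but can still use the short form, and ldc.i4 4 with ldc.i4.4, and we get shorter bytecode that does the same thing: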

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  .maxstack  1
  .locals init (int32 unassigned,
           int32 i,
           float64 d,
           float64 PI,
           string name)
  nop                           // Do Nothing (helps debugger to have some of these around).
  ldc.i4.4                      // Push number 4 on stack
  stloc.1                       // Pop value from stack, put in i (i = 4)
  ldloc.1                       // Push value in i on stack
  stloc.0                       // Pop value from stack, put in unassigned (unassigned = i)
  ldc.r8   12.34                // Push the 64-bit floating value 12.34 onto the stack
  stloc.2                       // Pop the value from stack, put in d (d = 12.34)
  ldc.r8   3.1415926535897931   // Push the 64-bit floating value 3.1415926535897931 onto the stack.
  stloc.3                       // Pop the value from stack, put in PI (PI = 3.1415… which is the constant Math.PI)
  ldstr    "Ehsan"              // Push the string "Ehsan" on stack
  stloc.s  name                 // Pop the value from stack, put in name
  ret                           // return.
}

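Now we have exactly the same code as your disassembly had, except that we've got better names. Remember that the names don't appear in the bytecode, so the disassembler couldn't do as good a job as we can.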
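You can check the "no names in the bytecode" point from C# itself. The following is only a sketch: LocalsDemo, Sample and DescribeLocals are names invented here, and Sample simply repeats the locals from the question. Reflection exposes each local's index and type through LocalVariableInfo, but there is no name to ask for. Note that in an optimized build the compiler may drop the unused locals, so the list can come back empty.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

static class LocalsDemo
{
    // Same locals as the question's Main, in a separate method so we can reflect over it.
    public static void Sample()
    {
        int unassigned;
        int i = 4;
        unassigned = i;
        double d = 12.34;
        double PI = Math.PI;
        string name = "Ehsan";
    }

    // Hypothetical helper: lists the locals the way a disassembler without debug info sees them.
    public static List<string> DescribeLocals(MethodInfo method)
    {
        var lines = new List<string>();
        MethodBody body = method.GetMethodBody();
        foreach (LocalVariableInfo local in body.LocalVariables)
            lines.Add($"V_{local.LocalIndex}: {local.LocalType}");
        return lines;
    }

    static void Main()
    {
        MethodInfo sample = typeof(LocalsDemo).GetMethod(nameof(Sample));
        foreach (string line in DescribeLocals(sample))
            Console.WriteLine(line);

        // The raw bytes of the method body; local names appear nowhere in here.
        Console.WriteLine(BitConverter.ToString(sample.GetMethodBody().GetILAsByteArray()));
    }
}
```

The V_n labels are made up by the sketch from LocalIndex, which is presumably much what a disassembler does when it has no debugging information to consult.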

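The question you raise in a comment really should be another question, but it offers a chance to add something important that I only briefly noted above. Let's consider: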

public static void Maybe(int a, int b)
{
  if (a > b)
    Console.WriteLine("Greater");
  Console.WriteLine("Done");
}

Compile that in debug and you end up with something like:

.method public hidebysig static 
  void Maybe (
    int32 a,
    int32 b
  ) cil managed 
{
  .maxstack 2
  .locals init (
    [0] bool CS$4$0000
  )

  IL_0000: nop
  IL_0001: ldarg.0
  IL_0002: ldarg.1
  IL_0003: cgt
  IL_0005: ldc.i4.0
  IL_0006: ceq
  IL_0008: stloc.0
  IL_0009: ldloc.0
  IL_000a: brtrue.s IL_0017

  IL_000c: ldstr "Greater"
  IL_0011: call void [mscorlib]System.Console::WriteLine(string)
  IL_0016: nop

  IL_0017: ldstr "Done"
  IL_001c: call void [mscorlib]System.Console::WriteLine(string)
  IL_0021: nop
  IL_0022: ret
}

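Now one thing to note is that all the labels like IL_0017 are added to every line based on the byte offset of the instruction. This makes life easier for the disassembler, but a label isn't really necessary unless it is jumped to. Let's strip out all the labels that aren't jumped to: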

.method public hidebysig static 
  void Maybe (
    int32 a,
    int32 b
  ) cil managed 
{
  .maxstack 2
  .locals init (
    [0] bool CS$4$0000
  )

  nop
  ldarg.0
  ldarg.1
  cgt
  ldc.i4.0
  ceq
  stloc.0
  ldloc.0
  brtrue.s IL_0017

  ldstr "Greater"
  call void [mscorlib]System.Console::WriteLine(string)
  nop

  IL_0017: ldstr "Done"
  call void [mscorlib]System.Console::WriteLine(string)
  nop
  ret
}

Now, let's consider what each line does:

.method public hidebysig static 
  void Maybe (
    int32 a,
    int32 b
  ) cil managed 
{
  .maxstack 2
  .locals init (
    [0] bool CS$4$0000
  )

  nop                   // Do nothing
  ldarg.0               // Load first argument (index 0) onto stack.
  ldarg.1               // Load second argument (index 1) onto stack.
  cgt                   // Pop two values from stack, push 1 (true) if the first is greater
                        // than the second, 0 (false) otherwise.
  ldc.i4.0              // Push 0 onto stack.
  ceq                   // Pop two values from stack, push 1 (true) if the two are equal,
                        // 0 (false) otherwise.
  stloc.0               // Pop value from stack, store in first local (index 0)
  ldloc.0               // Load first local onto stack.
  brtrue.s IL_0017      // Pop value from stack. If it's non-zero (true) jump to IL_0017

  ldstr "Greater"       // Load string "Greater" onto stack.

                        // Call Console.WriteLine(string)
  call void [mscorlib]System.Console::WriteLine(string)
  nop                   // Do nothing

  IL_0017: ldstr "Done" // Load string "Done" onto stack.
                        // Call Console.WriteLine(string)
  call void [mscorlib]System.Console::WriteLine(string)
  nop                   // Do nothing
  ret                   // return
}

Let's write this back into C# in a very literal step-by-step manner:

public static void Maybe(int a, int b)
{
  bool shouldJump = (a > b) == false;
  if (shouldJump) goto IL_0017;
  Console.WriteLine("Greater");
IL_0017:
  Console.WriteLine("Done");
}

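Try that and you'll find it does the same thing. The goto is there because CIL doesn't really have anything like for or while, or even blocks that we can put after an if or else; it just has jumps and conditional jumps.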
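The same applies to loops. As a rough illustration (the method names are invented here, and the exact layout a compiler emits may differ), a while loop can be rewritten by hand using nothing but labels, a jump and a conditional jump:

```csharp
using System;

static class GotoLoop
{
    // An ordinary C# loop: sum of 1..n.
    public static int SumWithWhile(int n)
    {
        int sum = 0;
        int i = 1;
        while (i <= n)
        {
            sum += i;
            i++;
        }
        return sum;
    }

    // The same loop expressed only with jumps, which is all CIL has to offer.
    public static int SumWithGoto(int n)
    {
        int sum = 0;
        int i = 1;
        goto Check;             // unconditional jump to the loop test
    Body:
        sum += i;
        i++;
    Check:
        if (i <= n) goto Body;  // conditional jump back into the body
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumWithWhile(10));
        Console.WriteLine(SumWithGoto(10));
    }
}
```

Both methods return the same result for any n; the second one is simply closer in shape to what the instruction stream has to look like.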

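But why does it bother to store the value (what I called shouldJump in my C# rewrite) rather than just act on it?

It's just to make it easier to examine what's going on at each point when you are debugging. In particular, for a debugger to be able to stop at the point where a > b has been worked out but not yet acted upon, either a > b or its opposite (a <= b) needs to be stored.

For this reason, debug builds tend to produce CIL that spends a lot of time recording what it has just done. With a release build we'd get something more like: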

.method public hidebysig static 
  void Maybe (
    int32 a,
    int32 b
  ) cil managed 
{
  ldarg.0           // Load first argument onto stack
  ldarg.1           // Load second argument onto stack
  ble.s IL_000e     // Pop two values from stack. If the first is
                    // less than or equal to the second, goto IL_000e: 
  ldstr "Greater"   // Load string "Greater" onto stack.
                    // Call Console.WriteLine(string)
  call void [mscorlib]System.Console::WriteLine(string)
                    // Load string "Done" onto stack.
  IL_000e: ldstr "Done"
                    // Call Console.WriteLine(string)
  call void [mscorlib]System.Console::WriteLine(string)
  ret
}

Or, doing a similar line-by-line write-back into C#:

public static void Maybe(int a, int b)
{
  if (a <= b) goto IL_000e;
  Console.WriteLine("Greater");
IL_000e:
  Console.WriteLine("Done");
}

So you can see how the release build does the same thing more concisely.

Answer 2

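MSIL is heavily micro-optimized to make its storage as small as possible. Go to the OpCodes class and note the Stloc instructions listed. There are 6 versions of it, and they all do exactly the same thing.

Stloc_0, Stloc_1, Stloc_2 and Stloc_3 are the smallest; they take only a single byte. The variable number they use is implicit, 0 through 3. They are very commonly used, of course.

Then there is Stloc_S, a two-byte instruction whose second byte encodes the variable number. It needs to be used when a method has more than 4 variables.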

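Lastly there is Stloc, a two-byte opcode (0xFE 0x0E) followed by two bytes that encode the variable number, four bytes in total. It must be used when a method has more than 256 variables. Hopefully you never do that. You are out of luck when you write a monster that has more than 65536 variables; that is not supported. It has been done, by the way: auto-generated code can blow through limits like this.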

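It's easy to see what happened in the second snippet: you added the unassigned variable, which increased the number of local variables from 4 to 5. Since there is no Stloc_4, the compiler has to use Stloc_S to assign the fifth variable.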
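If you would rather check the sizes than take them on trust, the OpCodes fields can be inspected directly. This is a small sketch; StlocForms and EncodedLength are hypothetical names, and the helper only handles the three operand kinds that the stloc family uses. OpCode.Size is the length of the opcode itself, and the operand length follows from OpCode.OperandType.

```csharp
using System;
using System.Reflection.Emit;

static class StlocForms
{
    // Hypothetical helper: opcode bytes plus operand bytes for a stloc-style instruction.
    public static int EncodedLength(OpCode op)
    {
        int operandBytes;
        switch (op.OperandType)
        {
            case OperandType.InlineNone:     operandBytes = 0; break; // index is implied by the opcode
            case OperandType.ShortInlineVar: operandBytes = 1; break; // one-byte local index
            case OperandType.InlineVar:      operandBytes = 2; break; // two-byte local index
            default: throw new NotSupportedException(op.OperandType.ToString());
        }
        return op.Size + operandBytes;
    }

    static void Main()
    {
        OpCode[] forms =
        {
            OpCodes.Stloc_0, OpCodes.Stloc_1, OpCodes.Stloc_2,
            OpCodes.Stloc_3, OpCodes.Stloc_S, OpCodes.Stloc
        };
        foreach (OpCode op in forms)
            Console.WriteLine($"{op.Name}: {EncodedLength(op)} byte(s)");
    }
}
```

The same approach should work for the ldloc and ldarg families, which use the same three operand kinds.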
