Why is str = str.Replace().Replace(); faster than str = str.Replace(); str = str.Replace()?

Question

I'm running a local test to compare the performance of replace operations on String and StringBuilder in C#. For String, I used the following code:

String str = "String to be tested. String to be tested. String to be tested.";
str = str.Replace("i", "in");
str = str.Replace("to", "ott");
str = str.Replace("St", "Tsr");
str = str.Replace(".", "\n");
str = str.Replace("be", "or be");
str = str.Replace("al", "xd");

However, after noticing that String.Replace() was faster than StringBuilder.Replace(), I started testing the code above against the following:

String str = "String to be tested. String to be tested. String to be tested.";
str = str.Replace("i", "in").Replace("to", "ott").Replace("St", "Tsr").Replace(".", "\n").Replace("be", "or be").Replace("al", "xd");

The second version turned out to be about 10% to 15% faster. Any ideas why? Is assigning a value to the same variable really that expensive?

Answer 1

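I'm not sure exactly what happens behind the scenes in your second snippet (or how exactly it differs under the hood from the first). However, my guess is that assigning to the same variable is slower because string is immutable.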

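That string is immutable means: even when you assign a new value to the same variable, a new memory address is allocated for it. In other words, you can picture a new variable being reserved for the new value, with the garbage collector later reclaiming the memory of the first value.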
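A minimal sketch of the immutability described above, using a shortened version of the question's test string. The class and method names (ImmutabilityDemo, ReplaceI) are made up for this illustration.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: applies the question's first replacement.
    public static string ReplaceI(string input)
    {
        return input.Replace("i", "in");
    }

    public static void Main()
    {
        string original = "String to be tested.";
        string replaced = ReplaceI(original);

        // The original string object is untouched by Replace.
        Console.WriteLine(original);   // String to be tested.

        // The replacement lives in a separate string object.
        Console.WriteLine(replaced);   // Strinng to be tested.
        Console.WriteLine(ReferenceEquals(original, replaced));   // False
    }
}
```

Each Replace call that changes something returns a new string object, whether or not the result is assigned to a local variable.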

Here is a reference:

There is a term called immutable, which means the state of an object can't be changed after is has been created. A string is an immutable type. The statement that a string is immutable means that, once created, it is not altered by changing the value assigned to it. If we try to change the value of a string by concatenation (using + operator) or assign a new value to it, it actually results in creation of a new string object to hold a reference to the newly generated string. It might seem that we have successfully altered the existing string. But behind the scenes, a new string reference is created, which points to the newly created string.

https://www.c-sharpcorner.com/UploadFile/b1df45/string-is-immutable-in-C-Sharp/

Again, this is only my guess; if anyone sees that I'm wrong, please comment.

Answer 2

I ran this benchmark:

namespace StringReplace
{
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    public class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<Program>();
        }

        private string str = "String to be tested. String to be tested. String to be tested.";

        [Benchmark]
        public string Test1()
        {
            var a = str;
            a = a.Replace("i", "in");
            a = a.Replace("to", "ott");
            a = a.Replace("St", "Tsr");
            a = a.Replace(".", "\n");
            a = a.Replace("be", "or be");
            a = a.Replace("al", "xd");

            return a;
        }

        [Benchmark]
        public string Test2()
        {
            var a = str;
            a = a.Replace("i", "in").Replace("to", "ott").Replace("St", "Tsr").Replace(".", "\n").Replace("be", "or be").Replace("al", "xd");

            return a;
        }
    }
}

Results:

BenchmarkDotNet=v0.10.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-7700 CPU 3.60GHz, ProcessorCount=8
Frequency=3515629 Hz, Resolution=284.4441 ns, Timer=TSC
Host Runtime=Clr 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.7.2600.0
Job Runtime(s):
    Clr 4.0.30319.42000, Arch=32-bit RELEASE


 Method |      Mean |    StdDev |    Median |
------- |---------- |---------- |---------- |
  Test1 | 1.3768 us | 0.0354 us | 1.3704 us |
  Test2 | 1.3941 us | 0.0325 us | 1.3778 us |

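As you can see, in Release mode the results are effectively the same: the difference between the two means is smaller than the standard deviation. So I think there may be a small difference in Debug mode because of the extra variable assignments, but in Release mode the compiler can optimize them away.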

Answer 3

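Short answer

You appear to be compiling in the Debug configuration. Because the compiler has to guarantee that a breakpoint can be set on every statement of the source code, the snippet that assigns to the local several times is less efficient.

If you compile in the Release configuration, which optimizes code generation at the cost of not letting you set breakpoints, both snippets compile to the same intermediate code and should therefore have the same performance.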

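Note that whether you compile in the Debug or the Release configuration is not necessarily related to whether you launch the application from Visual Studio with the debugger (F5) or without it (Ctrl + F5).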
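To see that these really are two separate questions, a program can report both at run time. The sketch below is not part of the original answer. It assumes the C# compiler marks unoptimized builds with a DebuggableAttribute whose IsJITOptimizerDisabled flag is set, which is the usual behavior but worth verifying for your toolchain. The class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class BuildConfigCheck
{
    // Hypothetical helper: true when the assembly asks the JIT not to optimize,
    // which is what a build without "optimize code" typically does.
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        DebuggableAttribute attribute = assembly.GetCustomAttribute<DebuggableAttribute>();
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }

    public static void Main()
    {
        Assembly current = typeof(BuildConfigCheck).Assembly;

        // How the code was compiled (Debug-style vs. Release-style).
        Console.WriteLine("JIT optimizer disabled: " + IsJitOptimizerDisabled(current));

        // How the process was launched (F5 vs. Ctrl + F5).
        Console.WriteLine("Debugger attached: " + Debugger.IsAttached);
    }
}
```

The two values can vary independently: a build without optimizations can be started without a debugger, and an optimized build can be started under one.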

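Long answer

C# compiles to .NET intermediate language (IL, MSIL, or CIL). The .NET SDK ships with a tool, IL Disassembler, that can show us this intermediate language so that we can better understand the difference. Note that the .NET runtime (VES) is a stack machine: instead of registers, IL operates on an "operand stack" onto which values are pushed and from which they are popped. The details are not too important for this question, but know that the evaluation stack is where temporary values are stored.

Disassembling the first snippet, which I compiled without the "optimize code" option set (that is, using the Debug configuration), shows code like this: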

  .locals init ([0] string str)
  IL_0000:  nop
  IL_0001:  ldstr      "String to be tested. String to be tested. String t" + "o be tested."
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldstr      "i"
  IL_000d:  ldstr      "in"
  IL_0012:  callvirt   instance string [mscorlib]System.String::Replace(string, string)
  IL_0017:  stloc.0
  IL_0018:  ldloc.0
  IL_0019:  ldstr      "to"
  IL_001e:  ldstr      "ott"
  IL_0023:  callvirt   instance string [mscorlib]System.String::Replace(string, string)

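The method has one local variable, str. In short, the snippet:

1. Creates the "String to be tested..." string on the evaluation stack (ldstr).
2. Stores the string into the local (stloc.0), leaving the evaluation stack empty.
3. Loads that value from the local back onto the stack (ldloc.0).
4. Calls Replace on the loaded value with two other strings, "i" and "in" (two ldstr and a callvirt), leaving the evaluation stack containing only the resulting string.
5. Stores the result back into the local (stloc.0), leaving the evaluation stack empty.
6. Loads that value from the local (ldloc.0).
7. Calls Replace on the loaded value with two other strings, "to" and "ott" (two ldstr and a callvirt).

And so on.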
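The seven steps above can be reproduced by hand with System.Reflection.Emit. This is only a sketch for experimenting with the instruction sequence, not the compiler's actual output. The class name is made up, the input string is shortened, and a ret is appended so the method can return the value left on the stack.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class DebugStyleReplace
{
    // Builds a method whose IL stores and reloads the local between calls.
    public static Func<string> Build()
    {
        MethodInfo replace = typeof(string).GetMethod(
            "Replace", new[] { typeof(string), typeof(string) });

        var method = new DynamicMethod("DebugStyleReplace", typeof(string), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();
        il.DeclareLocal(typeof(string));                  // local 0 plays the role of str

        il.Emit(OpCodes.Ldstr, "String to be tested.");   // step 1: push the string
        il.Emit(OpCodes.Stloc_0);                         // step 2: store it, stack is empty
        il.Emit(OpCodes.Ldloc_0);                         // step 3: load it back
        il.Emit(OpCodes.Ldstr, "i");
        il.Emit(OpCodes.Ldstr, "in");
        il.Emit(OpCodes.Callvirt, replace);               // step 4: first Replace
        il.Emit(OpCodes.Stloc_0);                         // step 5: store the result
        il.Emit(OpCodes.Ldloc_0);                         // step 6: load it again
        il.Emit(OpCodes.Ldstr, "to");
        il.Emit(OpCodes.Ldstr, "ott");
        il.Emit(OpCodes.Callvirt, replace);               // step 7: second Replace
        il.Emit(OpCodes.Ret);                             // return the value on the stack

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }

    public static void Main()
    {
        Console.WriteLine(Build()());   // Strinng ott be tested.
    }
}
```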

Compare that with the second snippet, also compiled without "optimize code":

  .locals init ([0] string str)
  IL_0000:  nop
  IL_0001:  ldstr      "String to be tested. String to be tested. String t" + "o be tested."
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldstr      "i"
  IL_000d:  ldstr      "in"
  IL_0012:  callvirt   instance string [mscorlib]System.String::Replace(string, string)
  IL_0017:  ldstr      "to"
  IL_001c:  ldstr      "ott"
  IL_0021:  callvirt   instance string [mscorlib]System.String::Replace(string, string)

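After step 4, the evaluation stack holds the result of the first Replace call. Because the C# code in this case does not assign that intermediate value to the str variable, the IL can avoid storing and reloading the value and simply reuse the result that is already on the evaluation stack. This skips steps 5 and 6, which makes the code slightly faster.

But wait, surely the compiler knows these snippets are equivalent, right? Why doesn't it always produce the second, more efficient set of IL instructions? Because I compiled without optimizations. The compiler therefore assumes that I need to be able to set a breakpoint on every C# statement. At a breakpoint, local variables need to be in a consistent state and the evaluation stack needs to be empty. That is why the first snippet has steps 5 and 6: the debugger can stop at a breakpoint between those steps, and I will see the str local holding the value I expect on that line.

If I compile these snippets with optimizations (for example, using the Release configuration), the compiler does indeed generate the same code for both: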

  // no .locals directive
  IL_0000:  ldstr      "String to be tested. String to be tested. String t" + "o be tested."
  IL_0005:  ldstr      "i"
  IL_000a:  ldstr      "in"
  IL_000f:  callvirt   instance string [mscorlib]System.String::Replace(string, string)
  IL_0014:  ldstr      "to"
  IL_0019:  ldstr      "ott"
  IL_001e:  callvirt   instance string [mscorlib]System.String::Replace(string, string)

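Now that the compiler knows I cannot set breakpoints, it can drop the local variable entirely and let the whole set of operations happen on the evaluation stack. It can therefore skip steps 2, 3, 5, and 6, which optimizes the code further.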
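The stack-only sequence can also be emitted by hand to confirm that it computes the same string without any local. As before, this is a sketch with a made-up class name and a shortened input string. It demonstrates that the result is equivalent and says nothing about timing.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class StackOnlyReplace
{
    // Builds a method that keeps every intermediate value on the evaluation stack.
    public static Func<string> Build()
    {
        MethodInfo replace = typeof(string).GetMethod(
            "Replace", new[] { typeof(string), typeof(string) });

        // No DeclareLocal call: the method has no .locals at all.
        var method = new DynamicMethod("StackOnlyReplace", typeof(string), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldstr, "String to be tested.");
        il.Emit(OpCodes.Ldstr, "i");
        il.Emit(OpCodes.Ldstr, "in");
        il.Emit(OpCodes.Callvirt, replace);   // result stays on the stack
        il.Emit(OpCodes.Ldstr, "to");
        il.Emit(OpCodes.Ldstr, "ott");
        il.Emit(OpCodes.Callvirt, replace);   // consumes the previous result directly
        il.Emit(OpCodes.Ret);

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }

    public static void Main()
    {
        string viaIl = Build()();
        string viaCSharp = "String to be tested.".Replace("i", "in").Replace("to", "ott");

        Console.WriteLine(viaIl);                // Strinng ott be tested.
        Console.WriteLine(viaIl == viaCSharp);   // True
    }
}
```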
