Method is not inlined by the JIT compiler even though all criteria seem to be met

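Background

While writing a class for parsing some text, I needed to be able to get the line number of a specific character position (in other words, count all the line breaks that occur before that character).

To find the most efficient code for this, I set up a couple of benchmarks, which showed that Regex was the slowest approach and that manually iterating over the string was the fastest.

Here is my current approach (10k iterations: 278 ms):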

private string text;

/// <summary>
/// Returns whether the specified character index is the end of a line.
/// </summary>
/// <param name="index">The index to check.</param>
/// <returns></returns>
private bool IsEndOfLine(int index)
{
    //Matches "\r" and "\n" (but not "\n" if it's preceded by "\r").
    char c = text[index];
    return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
}

/// <summary>
/// Returns the number of the line at the specified character index.
/// </summary>
/// <param name="index">The index of the character whose line number to get.</param>
/// <returns></returns>
public int GetLineNumber(int index)
{
    if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

    int lineNumber = 1;
    int end = index;

    index = 0;
    while(index < end) {
        if(IsEndOfLine(index)) lineNumber++;
        index++;
    }

    return lineNumber;
}

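However, while running these benchmarks I remembered that method calls can sometimes be a bit expensive, so I decided to also try moving the condition from IsEndOfLine() directly into the if statement inside GetLineNumber().

As I expected, this executes more than twice as fast (10k iterations: 112 ms):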

while(index < end) {
    char c = text[index];
    if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
    index++;
}

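Problem

As far as I know, the JIT compiler does not (or at least did not) inline methods whose IL is larger than 32 bytes[1] unless [MethodImplAttribute(MethodImplOptions.AggressiveInlining)] is specified[2]. However, even with this attribute applied to IsEndOfLine(), no inlining seems to occur.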
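
As a minimal sketch of what applying the hint looks like: the class name InliningSketch and the for-loop form are hypothetical and only for illustration, while the body of IsEndOfLine() is the same condition as above. The attribute is a request to the JIT, so whether the call is actually inlined can still vary by runtime and build settings.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical wrapper class, used only to make the sketch self-contained.
public sealed class InliningSketch
{
    private readonly string text;

    public InliningSketch(string text)
    {
        if (text == null) throw new ArgumentNullException("text");
        this.text = text;
    }

    // Asks the JIT to inline this method even if it exceeds the default size limit.
    // This is only a hint; the JIT may still decline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private bool IsEndOfLine(int index)
    {
        // Matches "\r" and "\n" (but not "\n" if it's preceded by "\r").
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    // Counts the line endings that occur before the given character index.
    public int GetLineNumber(int index)
    {
        if (index < 0 || index > text.Length) throw new ArgumentOutOfRangeException("index");

        int lineNumber = 1;
        for (int i = 0; i < index; i++)
        {
            if (IsEndOfLine(i)) lineNumber++;
        }
        return lineNumber;
    }
}
```

For example, new InliningSketch("a\r\nb").GetLineNumber(3) returns 2, because the "\r\n" pair is counted as a single line ending.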

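Most of the discussion I could find on this comes from older posts/articles. In the most recent one ([2], from 2012), the author apparently managed to inline a 34-byte function using MethodImplOptions.AggressiveInlining, which implies that the flag allows larger IL to be inlined as long as all other criteria are met.

Measuring the size of my method with the following code shows that it is 54 bytes long: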

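//IsEndOfLine() is private, so GetMethod() needs explicit binding flags (it returns null otherwise).
Console.WriteLine(this.GetType().GetMethod("IsEndOfLine", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance).GetMethodBody().GetILAsByteArray().Length);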

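Using the Disassembly window in VS 2019 shows the following assembly code for IsEndOfLine() (with C# source code turned on in the Viewing Options):

(Configuration: Release (x86), with Just My Code and Suppress JIT optimization on module load both disabled)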

--- [PATH REMOVED]\Performance Test - Find text line number\TextParser.cs 
    28:             char c = text[index];
001E19BA  in          al,dx  
001E19BB  mov         eax,dword ptr [ecx+4]  
001E19BE  cmp         edx,dword ptr [eax+4]  
001E19C1  jae         001E19FF  
001E19C3  movzx       eax,word ptr [eax+edx*2+8]  
    29:             return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
001E19C8  cmp         eax,0Dh  
001E19CB  je          001E19F8  
001E19CD  cmp         eax,0Ah  
001E19D0  jne         001E19F4  
001E19D2  test        edx,edx  
001E19D4  je          001E19ED  
001E19D6  dec         edx  
001E19D7  mov         eax,dword ptr [ecx+4]  
001E19DA  cmp         edx,dword ptr [eax+4]  
001E19DD  jae         001E19FF  
001E19DF  cmp         word ptr [eax+edx*2+8],0Dh  
001E19E5  setne       al  
001E19E8  movzx       eax,al  
001E19EB  pop         ebp  
001E19EC  ret  
001E19ED  mov         eax,1  
001E19F2  pop         ebp  
001E19F3  ret  
001E19F4  xor         eax,eax  
001E19F6  pop         ebp  
001E19F7  ret  
001E19F8  mov         eax,1  
001E19FD  pop         ebp  
001E19FE  ret  
001E19FF  call        70C2E2B0  
001E1A04  int         3  

...and the following code for the loop in GetLineNumber():

    63:             index = 0;
001E1950  xor         esi,esi  
    64:             while(index < end) {
001E1952  test        ebx,ebx  
001E1954  jle         001E196C  
001E1956  mov         ecx,edi  
001E1958  mov         edx,esi  
001E195A  call        dword ptr ds:[144E10h]  
001E1960  test        eax,eax  
001E1962  je          001E1967  
    65:                 if(IsEndOfLine(index)) lineNumber++;
001E1964  inc         dword ptr [ebp-10h]  
    66:                 index++;
001E1967  inc         esi  
    64:             while(index < end) {
001E1968  cmp         esi,ebx  
001E196A  jl          001E1956  
    67:             }
    68: 
    69:             return lineNumber;
001E196C  mov         eax,dword ptr [ebp-10h]  
001E196F  pop         ecx  
001E1970  pop         ebx  
001E1971  pop         esi  
001E1972  pop         edi  
001E1973  pop         ebp  
001E1974  ret  

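I'm not very good at reading assembly code, but it seems to me that no inlining has occurred.

Question

Why does the JIT compiler not inline my IsEndOfLine() method, even when MethodImplOptions.AggressiveInlining is specified? I know this flag is only a hint to the compiler, but based on [2], applying it should make it possible to inline IL larger than 32 bytes. Apart from that, my code seems to me to satisfy all the other conditions.

Is there some other kind of limitation that I'm missing?

Benchmarks

Results: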

Text length: 11645

Line: 201
Standard loop: 00:00:00.2779946 (10000 à 00:00:00.0000277)

Line: 201
Standard loop (inline): 00:00:00.1122908 (10000 à 00:00:00.0000112)

<Benchmark code has been moved for brevity>

Footnotes

1 To Inline or not to Inline: That is the question

2 Aggressive Inlining in the CLR 4.5 JIT


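-- EDIT --

For some reason, after restarting VS, enabling and re-disabling the settings mentioned earlier, and re-applying MethodImplOptions.AggressiveInlining, the method now appears to be inlined. However, the JIT adds a few instructions that are not there when you inline the if condition manually.

JIT-optimized version: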

    66:             while(index < end) {
001E194B  test        ebx,ebx  
001E194D  jle         001E1998  
001E194F  mov         esi,dword ptr [ecx+4]  
    67:                 if(IsEndOfLine(index)) lineNumber++;
001E1952  cmp         edx,esi  
001E1954  jae         001E19CA  
001E1956  movzx       eax,word ptr [ecx+edx*2+8]  

001E195B  cmp         eax,0Dh  
001E195E  je          001E1989  
001E1960  cmp         eax,0Ah  
001E1963  jne         001E1985  
001E1965  test        edx,edx  
001E1967  je          001E197E  
001E1969  mov         eax,edx  
001E196B  dec         eax  
001E196C  cmp         eax,esi  
001E196E  jae         001E19CA  
001E1970  cmp         word ptr [ecx+eax*2+8],0Dh  
001E1976  setne       al  
001E1979  movzx       eax,al  
001E197C  jmp         001E198E  
001E197E  mov         eax,1  
001E1983  jmp         001E198E  
001E1985  xor         eax,eax  
001E1987  jmp         001E198E  
001E1989  mov         eax,1  
001E198E  test        eax,eax  
001E1990  je          001E1993  
001E1992  inc         edi  
    68:                 index++;

My optimized version:

    87:             while(index < end) {
001E1E9B  test        ebx,ebx  
001E1E9D  jle         001E1ECE  
001E1E9F  mov         esi,dword ptr [ecx+4]  
    88:                 char c = text[index];
001E1EA2  cmp         edx,esi  
001E1EA4  jae         001E1F00  
001E1EA6  movzx       eax,word ptr [ecx+edx*2+8]  
    89:                 if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
001E1EAB  cmp         eax,0Dh  
001E1EAE  je          001E1EC8  
001E1EB0  cmp         eax,0Ah  
001E1EB3  jne         001E1EC9  
001E1EB5  test        edx,edx  
001E1EB7  je          001E1EC8  
001E1EB9  mov         eax,edx  
001E1EBB  dec         eax  
001E1EBC  cmp         eax,esi  
001E1EBE  jae         001E1F00  
001E1EC0  cmp         word ptr [ecx+eax*2+8],0Dh  
001E1EC6  je          001E1EC9  
001E1EC8  inc         edi  
    90:                 index++;

New instructions:

001E1976  setne       al  
001E1979  movzx       eax,al  
001E197C  jmp         001E198E  
001E197E  mov         eax,1  
001E1983  jmp         001E198E  
001E1985  xor         eax,eax  
001E1987  jmp         001E198E  
001E1989  mov         eax,1  
001E198E  test        eax,eax  

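I still don't see any improvement in performance/execution speed, though... Presumably this is due to the extra instructions the JIT adds, and I guess this is as good as it gets without inlining the condition myself?

For some reason, after restarting VS, enabling and re-disabling the settings mentioned earlier, and re-applying MethodImplOptions.AggressiveInlining, the method now appears to be inlined (odd that it wasn't before...). However, the JIT adds a few instructions that are not there when you inline the if condition manually.

The efficiency/execution speed seems to remain the same, however. Hans Passant suggested that I replace the short-circuiting operators with the regular | and & (where possible), which did narrow the speed gap from 2x to 1.5x. I guess this is as good as it gets when it comes to JIT optimization.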

return c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'));
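
The "where possible" caveat matters here: | and & always evaluate both operands, so the inner || has to stay, otherwise text[index - 1] would be evaluated even when index is 0. The sketch below illustrates the difference. The class and method names are hypothetical and not part of the benchmark, and the "fully eager" variant exists only to show the failure case.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class OperatorSketch
{
    // Original form: && and || skip the right operand when the left one decides the result.
    public static bool IsEndOfLineShortCircuit(string text, int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    // Suggested form: the outer operators are eager, but the inner || still
    // guards the text[index - 1] access when index is 0.
    public static bool IsEndOfLineEager(string text, int index)
    {
        char c = text[index];
        return c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'));
    }

    // Broken form: replacing the inner || as well means text[index - 1] is
    // always evaluated, which throws IndexOutOfRangeException at index 0.
    public static bool IsEndOfLineFullyEager(string text, int index)
    {
        char c = text[index];
        return c == '\r' | (c == '\n' & (index == 0 | text[index - 1] != '\r'));
    }
}
```

For the input "\nabc" and index 0, the first two methods return true, while the third throws because it reads text[-1].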

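One interesting discovery (or at least interesting to me, since I don't really understand how these assembly-level optimizations work under the hood) was that when the same operator swap is applied to the manually inlined condition (inside GetLineNumberInline()), the execution speed gets much worse.

The purpose of this adventure was to get code that is as efficient as possible without having to duplicate it everywhere I use it (since in the original code IsEndOfLine() is used multiple times throughout the project). In the end, I think I'll stick with duplicating the IsEndOfLine() code only inside GetLineNumber(), since that proved to be the fastest in terms of execution speed.

I would like to thank those who took the time to help me (some comments have since been deleted). Although I did not get the results I thought JIT-optimized inlining would give me, I still learned a lot that I didn't know before. I now have at least a rough idea of what JIT optimization does under the hood, and of how much more complex it is than I initially imagined.

Full benchmark results, for future reference (sorted by execution time):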

Text length: 15882
Character position: 11912

Standard loop (inline):                    00:00:00.1429526 (10000 à 0.0142 ms)
Standard loop (inline unsafe):             00:00:00.1642801 (10000 à 0.0164 ms)
Standard loop (inline + no short-circuit): 00:00:00.3250843 (10000 à 0.0325 ms)
Standard loop (AggressiveInlining):        00:00:00.3318966 (10000 à 0.0331 ms)
Standard loop (unsafe):                    00:00:00.3605394 (10000 à 0.0360 ms)
Standard loop:                             00:00:00.3859629 (10000 à 0.0385 ms)
Regex (Substring):                         00:00:01.8794045 (10000 à 0.1879 ms)
Regex (MatchCollection loop):              00:00:02.4916785 (10000 à 0.2491 ms)

Resulting line: 284

/* "unsafe" is using pointers to access the string's characters */

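using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Text.RegularExpressions;

class Program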
{
    const int RUNS = 10000;

    static void Main(string[] args)
    {
        string text = "";
        Random r = new Random();

        //Some words to fill the string with.
        string[] words = new string[] { "Hello", "world", "Inventory.MaxAmount 32", "+QUICKTORETALIATE", "TNT1 AABBCC 6 A_JumpIf(ACS_ExecuteWithResult(460, 0, 0, 0) == 0, \"See0\")" };

        //Various line endings.
        string[] endings = new string[] { "\r\n", "\r", "\n" };



        /*
            Generate text
        */
        int lineCount = r.Next(256, 513);

        for(int l = 0; l < lineCount; l++) {
            int wordCount = r.Next(1, 4);
            text += new string(' ', r.Next(4, 9));

            for(int w = 0; w < wordCount; w++) {
                text += words[wordCount] + (w < wordCount - 1 ? " " : "");
            }

            text += endings[r.Next(0, endings.Length)];
        }

        Console.WriteLine("Text length: " + text.Length);
        Console.WriteLine();



        /*
            Initialize class and stopwatch
        */
        TextParser parser = new TextParser(text);
        Stopwatch sw = new Stopwatch();

        List<int> numbers = new List<int>(); //Using a list to prevent the compiler from optimizing-away the "GetLineNumber" call.



        /*
            Test 1 - Standard loop
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumber((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop: ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 2 - Standard loop (with AggressiveInlining)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumber2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (AggressiveInlining): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 3 - Standard loop (with inline check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberInline((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 4 - Standard loop (with inline and no short-circuiting)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberInline2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline + no short-circuit): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 5 - Standard loop (with unsafe check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberUnsafe((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (unsafe): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 6 - Standard loop (with inline + unsafe check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberUnsafeInline((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline unsafe): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 7 - Regex (with Substring)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberRegex((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Regex (Substring): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 8 - Regex (with MatchCollection loop)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberRegex2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Regex (MatchCollection loop): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Tests completed
        */
        Console.Write("All tests completed. Press ENTER to close...");
        while(Console.ReadKey(true).Key != ConsoleKey.Enter);
    }
}

public class TextParser
{
    private static readonly Regex LineRegex = new Regex("\r\n|\r|\n", RegexOptions.Compiled);

    private string text;

    public TextParser(string text)
    {
        this.text = text;
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    private bool IsEndOfLine(int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private bool IsEndOfLineAggressiveInlining(int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    private bool IsEndOfLineUnsafe(int index)
    {
        unsafe
        {
            fixed(char* ptr = text) {
                char c = ptr[index];
                return c == '\r' || (c == '\n' && (index == 0 || ptr[index - 1] != '\r'));
            }
        }
    }



    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumber(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLine(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }



    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumber2(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLineAggressiveInlining(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberInline(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            char c = text[index];
            if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberInline2(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            char c = text[index];
            if(c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'))) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberUnsafe(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLineUnsafe(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberUnsafeInline(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        unsafe
        {
            fixed(char* ptr = text) {
                index = 0;
                while(index < end) {
                    char c = ptr[index];
                    if(c == '\r' || (c == '\n' && (index == 0 || ptr[index - 1] != '\r'))) lineNumber++;
                    index++;
                }
            }
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index. Utilizes a Regex.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberRegex(int index)
    {
        return LineRegex.Matches(text.Substring(0, index)).Count + 1;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index. Utilizes a Regex.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberRegex2(int index)
    {
        int lineNumber = 1;
        MatchCollection mc = LineRegex.Matches(text);

        for(int y = 0; y < mc.Count; y++) {
            if(mc[y].Index >= index) break;
            lineNumber++;
        }

        return lineNumber;
    }
}