Method is not inlined by the JIT compiler even though all criteria seem to be met
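Background
While writing a class for parsing certain text, I needed to be able to get the line number of a specific character position (in other words, count all line breaks that occur before that character).
To find the most efficient code for this, I set up a couple of benchmarks, which revealed that Regex was the slowest method and that manually iterating the string was the fastest.
Here is my current approach (10k iterations: 278 ms):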
private string text;

/// <summary>
/// Returns whether the specified character index is the end of a line.
/// </summary>
/// <param name="index">The index to check.</param>
/// <returns></returns>
private bool IsEndOfLine(int index)
{
    //Matches "\r" and "\n" (but not "\n" if it's preceded by "\r").
    char c = text[index];
    return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
}

/// <summary>
/// Returns the number of the line at the specified character index.
/// </summary>
/// <param name="index">The index of the character whose line number to get.</param>
/// <returns></returns>
public int GetLineNumber(int index)
{
    if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

    int lineNumber = 1;
    int end = index;

    index = 0;
    while(index < end) {
        if(IsEndOfLine(index)) lineNumber++;
        index++;
    }

    return lineNumber;
}
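However, while doing these benchmarks I remembered that method calls can sometimes be a bit expensive, so I decided to try moving the conditions from IsEndOfLine() directly into the if statement inside GetLineNumber() as well.
As I expected, this executes more than twice as fast (10k iterations: 112 ms):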
while(index < end) {
    char c = text[index];
    if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
    index++;
}
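Problem
From what I've read, the JIT compiler does not (or at least did not) inline methods whose IL is larger than 32 bytes[1] unless [MethodImplAttribute(MethodImplOptions.AggressiveInlining)] is specified[2]. Yet despite applying this attribute to IsEndOfLine(), no inlining seems to occur.
Most of the discussion I could find about this is from older posts/articles. In the most recent one ([2], from 2012), the author apparently succeeded in inlining a 34-byte function using MethodImplOptions.AggressiveInlining, which implies that the flag allows larger IL code to be inlined if all other criteria are met.
Measuring my method's size with the following code shows that it is 54 bytes long: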
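//IsEndOfLine() is private, so non-public instance methods must be requested explicitly (requires System.Reflection).
Console.WriteLine(this.GetType().GetMethod("IsEndOfLine", BindingFlags.NonPublic | BindingFlags.Instance).GetMethodBody().GetILAsByteArray().Length);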
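As a rough sketch of how the size measurement could be combined with a check that the attribute really ended up in the method's metadata, the reflection calls can be wrapped in a small helper. `SampleParser` and `InliningInfo` are hypothetical names used only for illustration; they are not part of the original project. The helper only reports what is recorded in metadata, not whether the JIT actually inlined anything.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for the parser class; only the method shape matters here.
public class SampleParser
{
    private readonly string text;

    public SampleParser(string text) { this.text = text; }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private bool IsEndOfLine(int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }
}

public static class InliningInfo
{
    // Private instance methods are only found when NonPublic | Instance is requested.
    private const BindingFlags AnyMethod =
        BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.Static;

    // Size of the method's IL body in bytes.
    public static int GetILSize(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(methodName, AnyMethod);
        return method.GetMethodBody().GetILAsByteArray().Length;
    }

    // True if the AggressiveInlining flag is recorded in the method's metadata.
    public static bool HasAggressiveInlining(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(methodName, AnyMethod);
        return (method.MethodImplementationFlags & MethodImplAttributes.AggressiveInlining) != 0;
    }

    public static void Main()
    {
        Console.WriteLine("IL size: " + GetILSize(typeof(SampleParser), "IsEndOfLine"));
        Console.WriteLine("AggressiveInlining: " + HasAggressiveInlining(typeof(SampleParser), "IsEndOfLine"));
    }
}
```

The reported IL size depends on the compiler and build configuration, so it may not match the 54 bytes measured above. Whether the call was really inlined still has to be confirmed in the disassembly.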
The Disassembly window in VS 2019 shows the following assembly code for IsEndOfLine() (with C# source code enabled under Viewing Options):
(Configuration: Release (x86), with "Just My Code" and "Suppress JIT optimization on module load" disabled)
--- [PATH REMOVED]\Performance Test - Find text line number\TextParser.cs
28: char c = text[index];
001E19BA in al,dx
001E19BB mov eax,dword ptr [ecx+4]
001E19BE cmp edx,dword ptr [eax+4]
001E19C1 jae 001E19FF
001E19C3 movzx eax,word ptr [eax+edx*2+8]
29: return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
001E19C8 cmp eax,0Dh
001E19CB je 001E19F8
001E19CD cmp eax,0Ah
001E19D0 jne 001E19F4
001E19D2 test edx,edx
001E19D4 je 001E19ED
001E19D6 dec edx
001E19D7 mov eax,dword ptr [ecx+4]
001E19DA cmp edx,dword ptr [eax+4]
001E19DD jae 001E19FF
001E19DF cmp word ptr [eax+edx*2+8],0Dh
001E19E5 setne al
001E19E8 movzx eax,al
001E19EB pop ebp
001E19EC ret
001E19ED mov eax,1
001E19F2 pop ebp
001E19F3 ret
001E19F4 xor eax,eax
001E19F6 pop ebp
001E19F7 ret
001E19F8 mov eax,1
001E19FD pop ebp
001E19FE ret
001E19FF call 70C2E2B0
001E1A04 int 3
...and the following code for the loop in GetLineNumber():
63: index = 0;
001E1950 xor esi,esi
64: while(index < end) {
001E1952 test ebx,ebx
001E1954 jle 001E196C
001E1956 mov ecx,edi
001E1958 mov edx,esi
001E195A call dword ptr ds:[144E10h]
001E1960 test eax,eax
001E1962 je 001E1967
65: if(IsEndOfLine(index)) lineNumber++;
001E1964 inc dword ptr [ebp-10h]
66: index++;
001E1967 inc esi
64: while(index < end) {
001E1968 cmp esi,ebx
001E196A jl 001E1956
67: }
68:
69: return lineNumber;
001E196C mov eax,dword ptr [ebp-10h]
001E196F pop ecx
001E1970 pop ebx
001E1971 pop esi
001E1972 pop edi
001E1973 pop ebp
001E1974 ret
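I'm not very good at reading assembly code, but it seems to me that no inlining has occurred.
Question
Why doesn't the JIT compiler inline my IsEndOfLine() method, even when MethodImplOptions.AggressiveInlining is specified? I know this flag is only a hint to the compiler, but based on [2], applying it should make it possible to inline IL larger than 32 bytes. Apart from that, my code seems to me to meet all the other criteria.
Is there some other kind of limitation that I'm missing?
Benchmark
Results: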
Text length: 11645
Line: 201
Standard loop: 00:00:00.2779946 (10000 à 00:00:00.0000277)
Line: 201
Standard loop (inline): 00:00:00.1122908 (10000 à 00:00:00.0000112)
<Benchmark code moved for brevity>
Footnotes
[1] To Inline or not to Inline: That is the question
[2] Aggressive Inlining in the CLR 4.5 JIT
-- EDIT --
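For some reason, after restarting VS, enabling and re-disabling the settings mentioned earlier, and re-applying MethodImplOptions.AggressiveInlining, the method now appears to be inlined. However, the JIT has added some instructions that are not there when you inline the if condition manually.
JIT-optimized version: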
66: while(index < end) {
001E194B test ebx,ebx
001E194D jle 001E1998
001E194F mov esi,dword ptr [ecx+4]
67: if(IsEndOfLine(index)) lineNumber++;
001E1952 cmp edx,esi
001E1954 jae 001E19CA
001E1956 movzx eax,word ptr [ecx+edx*2+8]
001E195B cmp eax,0Dh
001E195E je 001E1989
001E1960 cmp eax,0Ah
001E1963 jne 001E1985
001E1965 test edx,edx
001E1967 je 001E197E
001E1969 mov eax,edx
001E196B dec eax
001E196C cmp eax,esi
001E196E jae 001E19CA
001E1970 cmp word ptr [ecx+eax*2+8],0Dh
001E1976 setne al
001E1979 movzx eax,al
001E197C jmp 001E198E
001E197E mov eax,1
001E1983 jmp 001E198E
001E1985 xor eax,eax
001E1987 jmp 001E198E
001E1989 mov eax,1
001E198E test eax,eax
001E1990 je 001E1993
001E1992 inc edi
68: index++;
My optimized version:
87: while(index < end) {
001E1E9B test ebx,ebx
001E1E9D jle 001E1ECE
001E1E9F mov esi,dword ptr [ecx+4]
88: char c = text[index];
001E1EA2 cmp edx,esi
001E1EA4 jae 001E1F00
001E1EA6 movzx eax,word ptr [ecx+edx*2+8]
89: if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
001E1EAB cmp eax,0Dh
001E1EAE je 001E1EC8
001E1EB0 cmp eax,0Ah
001E1EB3 jne 001E1EC9
001E1EB5 test edx,edx
001E1EB7 je 001E1EC8
001E1EB9 mov eax,edx
001E1EBB dec eax
001E1EBC cmp eax,esi
001E1EBE jae 001E1F00
001E1EC0 cmp word ptr [ecx+eax*2+8],0Dh
001E1EC6 je 001E1EC9
001E1EC8 inc edi
90: index++;
New instructions:
001E1976 setne al
001E1979 movzx eax,al
001E197C jmp 001E198E
001E197E mov eax,1
001E1983 jmp 001E198E
001E1985 xor eax,eax
001E1987 jmp 001E198E
001E1989 mov eax,1
001E198E test eax,eax
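I still don't see any improvement in performance/execution speed, though... Presumably this is due to the extra instructions added by the JIT, and I'm guessing this is as good as it gets without inlining the condition myself?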
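One way to read the two listings, sketched here in C# as an interpretation rather than as actual JIT output: the inlined call first materializes the bool return value of IsEndOfLine() in a register (the mov eax,1 / xor eax,eax / setne al paths) and only then tests it, whereas the hand-inlined condition branches straight to the increment. The class and method names below are hypothetical and exist only to contrast the two shapes.

```csharp
using System;

public static class InlineShapes
{
    // Shape of the JIT-inlined loop body: the bool result is first stored
    // in a local (eax in the listing) and tested afterwards.
    public static int CountMaterialized(string text, int end)
    {
        int lineNumber = 1;
        for (int index = 0; index < end; index++)
        {
            char c = text[index];
            bool isEol;
            if (c == '\r') isEol = true;              // mov eax,1
            else if (c != '\n') isEol = false;        // xor eax,eax
            else if (index == 0) isEol = true;        // mov eax,1
            else isEol = text[index - 1] != '\r';     // setne al / movzx eax,al
            if (isEol) lineNumber++;                  // test eax,eax / inc edi
        }
        return lineNumber;
    }

    // Shape of the hand-inlined loop body: every comparison branches directly.
    public static int CountDirect(string text, int end)
    {
        int lineNumber = 1;
        for (int index = 0; index < end; index++)
        {
            char c = text[index];
            if (c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
        }
        return lineNumber;
    }

    public static void Main()
    {
        string sample = "one\r\ntwo\nthree\rfour";
        Console.WriteLine(CountMaterialized(sample, sample.Length)); // prints 4
        Console.WriteLine(CountDirect(sample, sample.Length));       // prints 4
    }
}
```

Both methods return the same result; the only difference is the extra store-then-test step, which corresponds to the "new instructions" listed above.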
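For some reason, after restarting VS, enabling and re-disabling the settings mentioned earlier, and re-applying MethodImplOptions.AggressiveInlining, the method now appears to be inlined (odd that it wasn't before...). However, the JIT has added some instructions that are not there when you inline the if condition manually.
The efficiency/execution speed seems to remain the same, however. Hans Passant suggested that I replace the short-circuiting operators (where possible) with the regular | and &, which did indeed narrow the speed gap from 2x to 1.5x. I guess this is as good as it gets when it comes to JIT optimization.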
return c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'));
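To illustrate why only the outer operators were swapped, here is a minimal sketch comparing the two forms. With | and &, both operands are always evaluated, which removes branches but means the right-hand side runs for every character. The inner || has to stay short-circuiting, because it guards the text[index - 1] read when index is 0. The class name and the static, string-taking signatures are assumptions made for the example.

```csharp
using System;

public static class LineEndings
{
    // Short-circuit form, as in the original IsEndOfLine().
    public static bool IsEndOfLineShortCircuit(string text, int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    // Outer operators replaced with | and &. The inner || is kept so that
    // text[index - 1] is never read when index is 0.
    public static bool IsEndOfLineNoShortCircuit(string text, int index)
    {
        char c = text[index];
        return c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'));
    }

    public static void Main()
    {
        // Starts with '\n' to exercise the index == 0 guard.
        string sample = "\na\r\nb\rc";
        for (int i = 0; i < sample.Length; i++)
        {
            if (IsEndOfLineShortCircuit(sample, i) != IsEndOfLineNoShortCircuit(sample, i))
                throw new InvalidOperationException("Mismatch at index " + i);
        }
        Console.WriteLine("Both forms agree on every index.");
    }
}
```

The two forms are logically equivalent, so any speed difference comes from how the JIT compiles them, not from a change in behavior.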
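An interesting discovery I made (or at least interesting to me, since I don't really understand how these assembly-level optimizations work under the hood) is that when the same operator swap is performed on the manually inlined condition (inside GetLineNumberInline()), execution speed becomes worse.
The purpose of this adventure was to get the most efficient code possible without having to duplicate it everywhere I use it (since in the original code IsEndOfLine() is used multiple times throughout the project). In the end, I think I'll stick with duplicating the IsEndOfLine() code only inside GetLineNumber(), since that has proven to be the fastest in terms of execution speed.
I'd like to thank those who took the time to help me (some comments have been deleted). Although I didn't achieve the result I thought JIT-optimized inlining would get me, I've still learned a lot that I didn't know before. I now have at least a rough idea of what JIT optimization does under the hood, and of how much more complicated it is than I initially imagined.
Full benchmark results, for future reference (ordered by execution time):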
Text length: 15882
Character position: 11912
Standard loop (inline): 00:00:00.1429526 (10000 à 0.0142 ms)
Standard loop (inline unsafe): 00:00:00.1642801 (10000 à 0.0164 ms)
Standard loop (inline + no short-circuit): 00:00:00.3250843 (10000 à 0.0325 ms)
Standard loop (AggressiveInlining): 00:00:00.3318966 (10000 à 0.0331 ms)
Standard loop (unsafe): 00:00:00.3605394 (10000 à 0.0360 ms)
Standard loop: 00:00:00.3859629 (10000 à 0.0385 ms)
Regex (Substring): 00:00:01.8794045 (10000 à 0.1879 ms)
Regex (MatchCollection loop): 00:00:02.4916785 (10000 à 0.2491 ms)
Resulting line: 284
/* "unsafe" is using pointers to access the string's characters */
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Text.RegularExpressions;

class Program
{
    const int RUNS = 10000;

    static void Main(string[] args)
    {
        string text = "";
        Random r = new Random();

        //Some words to fill the string with.
        string[] words = new string[] { "Hello", "world", "Inventory.MaxAmount 32", "+QUICKTORETALIATE", "TNT1 AABBCC 6 A_JumpIf(ACS_ExecuteWithResult(460, 0, 0, 0) == 0, \"See0\")" };

        //Various line endings.
        string[] endings = new string[] { "\r\n", "\r", "\n" };

        /*
            Generate text
        */
        int lineCount = r.Next(256, 513);

        for(int l = 0; l < lineCount; l++) {
            int wordCount = r.Next(1, 4);
            text += new string(' ', r.Next(4, 9));

            for(int w = 0; w < wordCount; w++) {
                text += words[wordCount] + (w < wordCount - 1 ? " " : "");
            }

            text += endings[r.Next(0, endings.Length)];
        }

        Console.WriteLine("Text length: " + text.Length);
        Console.WriteLine();

        /*
            Initialize class and stopwatch
        */
        TextParser parser = new TextParser(text);
        Stopwatch sw = new Stopwatch();

        List<int> numbers = new List<int>(); //Using a list to prevent the compiler from optimizing-away the "GetLineNumber" call.

        /*
            Test 1 - Standard loop
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumber((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop: ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Test 2 - Standard loop (with AggressiveInlining)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumber2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (AggressiveInlining): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Test 3 - Standard loop (with inline check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberInline((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Test 4 - Standard loop (with inline and no short-circuiting)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberInline2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline + no short-circuit): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Test 5 - Standard loop (with unsafe check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberUnsafe((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (unsafe): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Test 6 - Standard loop (with inline + unsafe check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberUnsafeInline((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline unsafe): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Test 7 - Regex (with Substring)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberRegex((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Regex (Substring): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Test 8 - Regex (with MatchCollection loop)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberRegex2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Regex (MatchCollection loop): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();

        /*
            Tests completed
        */
        Console.Write("All tests completed. Press ENTER to close...");
        while(Console.ReadKey(true).Key != ConsoleKey.Enter);
    }
}
public class TextParser
{
    private static readonly Regex LineRegex = new Regex("\r\n|\r|\n", RegexOptions.Compiled);

    private string text;

    public TextParser(string text)
    {
        this.text = text;
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    private bool IsEndOfLine(int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private bool IsEndOfLineAggressiveInlining(int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    private bool IsEndOfLineUnsafe(int index)
    {
        unsafe
        {
            fixed(char* ptr = text) {
                char c = ptr[index];
                return c == '\r' || (c == '\n' && (index == 0 || ptr[index - 1] != '\r'));
            }
        }
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumber(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLine(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumber2(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLineAggressiveInlining(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberInline(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            char c = text[index];
            if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberInline2(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            char c = text[index];
            if(c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'))) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberUnsafe(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLineUnsafe(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberUnsafeInline(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        unsafe
        {
            fixed(char* ptr = text) {
                index = 0;
                while(index < end) {
                    char c = ptr[index];
                    if(c == '\r' || (c == '\n' && (index == 0 || ptr[index - 1] != '\r'))) lineNumber++;
                    index++;
                }
            }
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index. Utilizes a Regex.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberRegex(int index)
    {
        return LineRegex.Matches(text.Substring(0, index)).Count + 1;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index. Utilizes a Regex.
    /// </summary>
    /// <param name="index">The index of the character whose line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberRegex2(int index)
    {
        int lineNumber = 1;
        MatchCollection mc = LineRegex.Matches(text);

        for(int y = 0; y < mc.Count; y++) {
            if(mc[y].Index >= index) break;
            lineNumber++;
        }

        return lineNumber;
    }
}