Understanding lazy evaluation in LINQ in C#
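I am reading this article about LINQ and cannot understand how the query is executed in terms of lazy evaluation.
So I simplified the example from the article to the following code: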
void Main()
{
    var data =
        from f in GetFirstSequence().LogQuery("GetFirstSequence")
        from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
        select $"{f} {s}";
    data.Dump(); // I use LINQPAD to output the data
}

static IEnumerable<string> GetFirstSequence()
{
    yield return "a";
    yield return "b";
    yield return "c";
}

static IEnumerable<string> GetSecondSequence()
{
    yield return "1";
    yield return "2";
}

public static class Extensions
{
    private const string path = @"C:\dist\debug.log";

    public static IEnumerable<string> LogQuery(this IEnumerable<string> sequence, string tag, string element = null)
    {
        using (var writer = File.AppendText(path))
        {
            writer.WriteLine($"Executing query {tag} {element}");
        }
        return sequence;
    }
}
After executing this code, I get output in the debug.log file that can be explained logically:
Executing query GetFirstSequence
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c
Things get strange when I try to interleave the first three elements with the last three, like this:
void Main()
{
    var data =
        from f in GetFirstSequence().LogQuery("GetFirstSequence")
        from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
        select $"{f} {s}";
    var shuffle = data;
    shuffle = shuffle.Take(3).LogQuery("Take")
        .Interleave(shuffle.Skip(3).LogQuery("Skip")).LogQuery("Interleave");
    shuffle.Dump();
}
Of course, I also need to add an extension method that interleaves two sequences (taken from the article mentioned above):
public static IEnumerable<string> Interleave(this IEnumerable<string> first, IEnumerable<string> second)
{
    var firstIter = first.GetEnumerator();
    var secondIter = second.GetEnumerator();
    while (firstIter.MoveNext() && secondIter.MoveNext())
    {
        yield return firstIter.Current;
        yield return secondIter.Current;
    }
}
After executing these lines, I get the following output in my log file:
Executing query GetFirstSequence
Executing query Take
Executing query Skip
Executing query Interleave
Executing query GetSecondSequence a
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c
Executing query GetSecondSequence b
This puzzles me, because I do not understand the order in which my query is executed.
Why is the query executed like this?
var data =
    from f in GetFirstSequence().LogQuery("GetFirstSequence")
    from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
    select $"{f} {s}";
is just another way of writing
var data = GetFirstSequence()
    .LogQuery("GetFirstSequence")
    .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f), (f, s) => $"{f} {s}");
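The equivalence of the two forms can be checked with a small standalone sketch. The class name `QuerySyntaxDemo` and the arrays `letters` and `digits` are made up for illustration and stand in for the sequences above:

```csharp
using System;
using System.Linq;

public static class QuerySyntaxDemo
{
    public static void Main()
    {
        // Hypothetical stand-ins for GetFirstSequence() and GetSecondSequence().
        var letters = new[] { "a", "b" };
        var digits = new[] { "1", "2" };

        // Two "from" clauses in query syntax...
        var query = from f in letters
                    from s in digits
                    select $"{f} {s}";

        // ...compile to a single SelectMany call with a result selector.
        var method = letters.SelectMany(f => digits, (f, s) => $"{f} {s}");

        Console.WriteLine(query.SequenceEqual(method)); // True
    }
}
```

The collection selector `f => digits` is the part that runs once per outer element, which is why the `LogQuery` call inside it fires once for each of "a", "b" and "c" in the original query.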
Let's step through the code:
var data = GetFirstSequence() // returns an IEnumerable<string> without evaluating it
    .LogQuery("GetFirstSequence") // writes "GetFirstSequence" and returns the IEnumerable<string> from its this-parameter without evaluating it
    .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f), (f, s) => $"{f} {s}"); // returns an IEnumerable<string> without evaluating it

var shuffle = data;
shuffle = shuffle
    .Take(3) // returns an IEnumerable<string> without evaluating it
    .LogQuery("Take") // writes "Take" and returns the IEnumerable<string> from its this-parameter without evaluating it
    .Interleave(
        shuffle
            .Skip(3) // returns an IEnumerable<string> without evaluating it
            .LogQuery("Skip") // writes "Skip" and returns the IEnumerable<string> from its this-parameter without evaluating it
    ) // returns an IEnumerable<string> without evaluating it
    .LogQuery("Interleave"); // writes "Interleave" and returns the IEnumerable<string> from its this-parameter without evaluating it
The code so far is responsible for the first four lines of output:
Executing query GetFirstSequence
Executing query Take
Executing query Skip
Executing query Interleave
None of the IEnumerables has been evaluated yet.
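The difference between building a query and evaluating it can be seen in a minimal sketch. It assumes an in-memory `Log` list in place of debug.log; the names `DeferredDemo`, `Letters` and `LogEager` are hypothetical and only mirror the shape of `LogQuery`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // In-memory stand-in for debug.log (an assumption made for this sketch).
    public static readonly List<string> Log = new List<string>();

    static IEnumerable<string> Letters()
    {
        // Iterator body: runs on the first MoveNext(), not when Letters() is called.
        Log.Add("Letters started");
        yield return "a";
        yield return "b";
    }

    // Same shape as LogQuery: an ordinary method, so it logs as soon as it is called.
    static IEnumerable<string> LogEager(this IEnumerable<string> sequence, string tag)
    {
        Log.Add($"Building {tag}");
        return sequence;
    }

    public static void Main()
    {
        var query = Letters().LogEager("Letters").Select(x => x.ToUpper());
        Console.WriteLine(string.Join(", ", Log)); // Building Letters

        var result = query.ToList(); // enumeration happens here
        Console.WriteLine(string.Join(", ", Log)); // Building Letters, Letters started
        Console.WriteLine(string.Join(", ", result)); // A, B
    }
}
```

Because `LogEager` is not an iterator, its log line marks when the query was composed, not when the sequence was consumed. The same holds for `LogQuery` in the question, which is why "GetFirstSequence", "Take", "Skip" and "Interleave" are each logged exactly once, before any element is produced.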
Finally, shuffle.Dump() iterates over shuffle and thereby evaluates the IEnumerables.
Iterating over data prints the following, because SelectMany() calls GetSecondSequence() and LogQuery() for each element in GetFirstSequence():
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c
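Iterating over shuffle is the same as iterating over Interleave(data.Take(3), data.Skip(3)).
Take(3) and Skip(3) each enumerate data independently, so the SelectMany() selector runs once per element for each of the two enumerations.
Interleave() interleaves the elements from these two iterations over data, and therefore also interleaves the log output that iterating them produces.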
firstIter.MoveNext();
// writes "Executing query GetSecondSequence a"
secondIter.MoveNext();
// writes "Executing query GetSecondSequence a"
// skips "a 1" from second sequence
// skips "a 2" from second sequence
// writes "Executing query GetSecondSequence b"
// skips "b 1" from second sequence
yield return firstIter.Current; // "a 1"
yield return secondIter.Current; // "b 2"

firstIter.MoveNext();
secondIter.MoveNext();
// writes "Executing query GetSecondSequence c"
yield return firstIter.Current; // "a 2"
yield return secondIter.Current; // "c 1"

firstIter.MoveNext();
// writes "Executing query GetSecondSequence b"
secondIter.MoveNext();
yield return firstIter.Current; // "b 1"
yield return secondIter.Current; // "c 2"
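To check the trace above outside LINQPad, here is a self-contained sketch of the same query. It makes a few assumptions that are not in the original: the log goes to an in-memory `Log` list instead of C:\dist\debug.log, trailing spaces in log lines are trimmed, `ToList()` replaces `Dump()`, and the enumerators in `Interleave` are wrapped in `using` so they get disposed. The class name `InterleaveTrace` and the `Run` method are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InterleaveTrace
{
    // In-memory stand-in for debug.log (an assumption made for this sketch).
    public static readonly List<string> Log = new List<string>();

    static IEnumerable<string> GetFirstSequence()
    {
        yield return "a";
        yield return "b";
        yield return "c";
    }

    static IEnumerable<string> GetSecondSequence()
    {
        yield return "1";
        yield return "2";
    }

    // Logs immediately when called; it does not touch the sequence itself.
    static IEnumerable<string> LogQuery(this IEnumerable<string> sequence, string tag, string element = null)
    {
        Log.Add($"Executing query {tag} {element}".TrimEnd());
        return sequence;
    }

    static IEnumerable<string> Interleave(this IEnumerable<string> first, IEnumerable<string> second)
    {
        using (var firstIter = first.GetEnumerator())
        using (var secondIter = second.GetEnumerator())
        {
            while (firstIter.MoveNext() && secondIter.MoveNext())
            {
                yield return firstIter.Current;
                yield return secondIter.Current;
            }
        }
    }

    public static List<string> Run()
    {
        Log.Clear();
        var data = from f in GetFirstSequence().LogQuery("GetFirstSequence")
                   from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
                   select $"{f} {s}";

        // data is enumerated twice: once through Take(3), once through Skip(3).
        return data.Take(3).LogQuery("Take")
                   .Interleave(data.Skip(3).LogQuery("Skip")).LogQuery("Interleave")
                   .ToList();
    }

    public static void Main()
    {
        foreach (var item in Run()) Console.WriteLine(item);
        foreach (var line in Log) Console.WriteLine(line);
    }
}
```

The items should come out in the order a 1, b 2, a 2, c 1, b 1, c 2, followed by the nine log lines shown in the question. If `data` were materialized first, for example with `ToList()`, the "GetSecondSequence" lines would appear only three times, at the cost of evaluating the whole sequence up front.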