Apache Ignite: How can I improve insertion performance?
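What else can I do to reduce insert times in Apache Ignite.NET, besides using IDataStreamer and IBinaryObject? Is a significant performance gain still possible, or is this already about as good as can be expected?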
I am using:
- .NET
- 41 query fields: 1 string field and 40 float fields per row
- IBinaryObject / WithKeepBinary
- IDataStreamer
- Default JVM settings
- Partitioned cache
- No persistence
Here is how I use IDataStreamer:
using (var ds = m_ignite.GetDataStreamer<string, IBinaryObject>(CacheName)) {
    foreach (var binaryRow in rows.Select(r => BuildRow(r))) {
        var key = binaryRow.GetField<string>(PrimaryKeyName);
        ds.AddData(key, binaryRow);
    }
}
Performance results (5 nodes, all with identical specs):
BenchmarkDotNet=v0.10.8, OS=Windows 8.1 (6.3.9600)
Processor=Intel Xeon CPU E5-2698 v4 2.20GHz Intel Xeon CPU E5-2698 v4 2.20GHz, ProcessorCount=4
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
[Host] : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2053.0
Job-UZDKMF : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2053.0
RunStrategy=Monitoring TargetCount=1
NumRows      Mean (ms)       Per Row (ms/row)
10           359.50*         35.95*
100          465.50*         4.66*
1,000        797.80*         0.80*
10,000       4,479.80        0.45
100,000      37,611.60       0.38
500,000      184,640.00      0.37
1,000,000    366,801.40      0.37
2,000,000    732,562.40      0.37
4,000,000    1,458,913.60    0.36
*Measurement is larger because it also measures some lightweight work before inserting the rows
Any tips, tricks, or documentation would be appreciated. Thanks!
Don't call GetField to retrieve the key; return it directly from BuildRow (i.e., have BuildRow return a KeyValuePair<string, IBinaryObject>).
Insert in parallel (including the BuildRow calls):
Parallel.ForEach(rows, r =>
{
    KeyValuePair<string, IBinaryObject> pair = BuildRow(r);
    ds.AddData(pair);
});
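Putting the two suggestions above together, a minimal sketch might look like the following. The `Row` class, the `"Row"` type name, and the `"Id"` / `"F0"`..`"F39"` field names are hypothetical stand-ins for the asker's schema, and `BuildRow` is assumed to be implemented with `IBinaryObjectBuilder`, which the question does not show.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Binary;

// Hypothetical source row: one string key plus 40 float values.
public sealed class Row
{
    public string Id;
    public float[] Values;
}

public static class RowLoader
{
    // Builds the binary object and returns it together with its key,
    // so the caller never has to read the key back with GetField.
    public static KeyValuePair<string, IBinaryObject> BuildRow(IBinary binary, Row row)
    {
        IBinaryObjectBuilder builder = binary.GetBuilder("Row"); // assumed type name
        builder.SetField("Id", row.Id);                          // assumed field name
        for (int i = 0; i < row.Values.Length; i++)
        {
            builder.SetField("F" + i, row.Values[i]);            // assumed field names
        }
        return new KeyValuePair<string, IBinaryObject>(row.Id, builder.Build());
    }

    public static void Load(IIgnite ignite, string cacheName, IEnumerable<Row> rows)
    {
        IBinary binary = ignite.GetBinary();
        using (var ds = ignite.GetDataStreamer<string, IBinaryObject>(cacheName))
        {
            // Each iteration creates its own builder; only the streamer is shared.
            Parallel.ForEach(rows, r =>
            {
                ds.AddData(BuildRow(binary, r));
            });
        } // Disposing the streamer flushes any rows still buffered.
    }
}
```

Whether this helps depends on where the time goes: if building the binary objects dominates, parallelizing `BuildRow` should show up in the per-row figure; if the cluster or network is the bottleneck, the gain may be small.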
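Run more Ignite nodes on more machines.
If the rows come from an external data source, you can have each Ignite node load only the part that belongs to it. To do this, use ICompute.Broadcast to run a DataStreamer on every node, and check while iterating over the rows whether each key belongs to the current node: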
ICacheAffinity aff = m_ignite.GetAffinity(cacheName);
IClusterNode localNode = m_ignite.GetCluster().GetLocalNode();
Parallel.ForEach(rows, r =>
{
    string key = GetKey(r);
    if (aff.IsPrimary(localNode, key))
    {
        KeyValuePair<string, IBinaryObject> pair = BuildRow(r);
        ds.AddData(pair);
    }
});
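For completeness, here is one way the broadcast side of this could be wired up, as a rough sketch only. `LoadLocalRowsAction`, `Row`, and `ExternalSource.ReadRows` are hypothetical names that do not appear in the question; the reader is a placeholder that yields nothing, and the field names are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Binary;
using Apache.Ignite.Core.Compute;
using Apache.Ignite.Core.Resource;

// Hypothetical row shape: one string key plus 40 float values.
public sealed class Row { public string Id; public float[] Values; }

public static class ExternalSource
{
    // Placeholder: replace with a reader for the real external data source.
    public static IEnumerable<Row> ReadRows() { yield break; }
}

[Serializable]
public class LoadLocalRowsAction : IComputeAction
{
    private readonly string _cacheName;

    // Set by Ignite on the node where the action executes.
    [InstanceResource]
    private IIgnite _ignite;

    public LoadLocalRowsAction(string cacheName) { _cacheName = cacheName; }

    public void Invoke()
    {
        var aff = _ignite.GetAffinity(_cacheName);
        var localNode = _ignite.GetCluster().GetLocalNode();
        var binary = _ignite.GetBinary();

        using (var ds = _ignite.GetDataStreamer<string, IBinaryObject>(_cacheName))
        {
            Parallel.ForEach(ExternalSource.ReadRows(), r =>
            {
                // Skip rows whose primary copy lives on another node.
                if (!aff.IsPrimary(localNode, r.Id)) return;

                var builder = binary.GetBuilder("Row"); // assumed type name
                builder.SetField("Id", r.Id);           // assumed field names
                for (int i = 0; i < r.Values.Length; i++)
                    builder.SetField("F" + i, r.Values[i]);

                ds.AddData(new KeyValuePair<string, IBinaryObject>(r.Id, builder.Build()));
            });
        }
    }
}
```

It would be started once from any node with `ignite.GetCompute().Broadcast(new LoadLocalRowsAction(cacheName));`. The assembly containing the action has to be loadable on every node that runs it. Note that in this sketch every node still reads the full source and discards the rows it does not own, so the saving is in binary-object construction and network transfer, not in reading.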