Azure Service Fabric - Distributed computation code sample Monte Carlo simulation - performance issues

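Having listened to a recent Azure podcast (particularly the one on building low-latency financial systems on Azure) and read all the hype about Service Fabric, I decided to try adapting the 'Distributed computation code sample Monte Carlo simulation' pattern to my needs.

My scenario is: one request with a given start state runs 10k full sports match simulations using a simple (computationally speaking) Monte Carlo based model.

My first attempt was: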

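- 1 * Stateful 'Processor' Actor that receives the match start state and forwards it to the 10k+ Task Actors, along with the ActorId of the relevant Aggregator
- 10K+ * Stateless 'Task' Actors that each run 1 simulation and pass the result to their Aggregator Actor. The simulation time is small (~2 ms)
- 100 * Stateful 'Aggregator' Actors that receive the simulation results and pass them on to the Finaliser Actor
- 1 * 'Finaliser' Actor that computes the final result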

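Running on my dev box, an equivalent setup using only Tasks took < 100 ms, but the actor-based setup above (running on the same dev machine as a local cluster) took 50 seconds or even more!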
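For context, here is a minimal sketch of what a Tasks-only baseline of this kind could look like. It is not the actual model: `SimulateMatch`, the win probability and the seed are hypothetical stand-ins, and the real simulation (~2 ms each) does far more work per run. Each worker task plays the role of an aggregator by summing its own share of the simulations before the totals are combined.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class TaskOnlyBaseline
{
    // Hypothetical stand-in for one full match simulation.
    // Returns 1 if the home side wins, 0 otherwise.
    public static int SimulateMatch(Random rng, double homeWinProbability)
    {
        return rng.NextDouble() < homeWinProbability ? 1 : 0;
    }

    // Splits the simulations across worker tasks; each task aggregates its
    // own partial result, and the partial results are combined at the end.
    public static double EstimateHomeWinRate(int simulations, double homeWinProbability, int seed)
    {
        if (simulations <= 0)
            throw new ArgumentOutOfRangeException(nameof(simulations));

        int workers = Math.Min(Environment.ProcessorCount, simulations);

        Task<int>[] tasks = Enumerable.Range(0, workers)
            .Select(w => Task.Run(() =>
            {
                // System.Random is not thread-safe, so each task gets its own instance.
                var rng = new Random(seed + w);
                int count = simulations / workers + (w < simulations % workers ? 1 : 0);
                int wins = 0;
                for (int i = 0; i < count; i++)
                    wins += SimulateMatch(rng, homeWinProbability);
                return wins;
            }))
            .ToArray();

        Task.WaitAll(tasks);
        return tasks.Sum(t => t.Result) / (double)simulations;
    }

    public static void Main()
    {
        var stopwatch = Stopwatch.StartNew();
        double rate = EstimateHomeWinRate(10_000, 0.55, seed: 42);
        stopwatch.Stop();
        Console.WriteLine($"Estimated home win rate: {rate:F3} in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

Everything here stays in one process with no remoting or serialization, which is the main difference from the actor-based version.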

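After some debugging, I found one potential cause: the amount of time the Processor Actor takes to send out the initial tasks. So I am wondering what kind of overhead there is in calling into Service Fabric (I am guessing that all sorts of naming service calls happen when I call a method on an actor), and whether the slowness could be due to this combined with the number of tasks I have?

To rule out other possibilities, I did the following and found only a small difference in the total time: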

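- Made all the actors stateless, to make sure state management was not adding overhead.
- Created all the ActorProxies in the Processor up front and stored references to them for later calls, to make sure Actor activation was not causing the problem.

Does anyone have any suggestions on where to go from here, or has anyone tried to implement something similar?

Thanks, Alex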

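I would post this as a comment, but I don't have enough reputation yet! If you refer to this page in Service Fabric's documentation, take a look at the comments below the article, particularly the comment trail started by "tom" sometime around June 2015. He was experiencing poor performance (~20 operations per second) with stateful actors, which seemed to be acknowledged as an area of future improvement. They stressed the use of readonly attributes on non-mutating methods to significantly improve performance. Abhishek Ram also included some notes and a link to information on relevant performance counters that may help with troubleshooting.

You noted that you tried stateless actors and that it had little effect on performance. I would point you further down the comment trail, where another user reported achieving 2k+ operations per second on a single actor using read-only methods, which I would expect to perform similarly to stateless actor methods. It may be worth comparing your performance counter data against this to see how your numbers match up with some of the trivial examples in those comments.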
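As a rough sanity check, the figures from the question and from the comment trail can be put side by side. The sketch below assumes the 10k calls are dispatched one after another and that dispatch dominates the run time; both are assumptions, not measurements, and the helper names are made up for illustration.

```csharp
using System;

public static class ThroughputCheck
{
    // Time needed to make `calls` sequential calls at a given rate.
    public static double SecondsFor(int calls, double opsPerSecond) => calls / opsPerSecond;

    // Effective call rate implied by an observed total time.
    public static double OpsPerSecond(int calls, double seconds) => calls / seconds;

    public static void Main()
    {
        const int calls = 10_000;

        // Observed in the question: roughly 50 s for 10k task dispatches.
        Console.WriteLine($"{OpsPerSecond(calls, 50.0):F0} ops/s");   // 200 ops/s

        // Rates quoted in the comment trail.
        Console.WriteLine($"{SecondsFor(calls, 20.0):F0} s");         // 500 s at ~20 ops/s
        Console.WriteLine($"{SecondsFor(calls, 2000.0):F0} s");       // 5 s at 2k ops/s
    }
}
```

Under those assumptions the observed run works out to about 200 calls per second (5 ms per call), which sits between the two figures from the comments.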

c#

azure

actor

azure-service-fabric