Should Storm spouts sleep() or yield()?

The Storm documentation for nextTuple() says:

When there are no tuples to emit, it is courteous to have nextTuple sleep for a short amount of time (like a single millisecond) so as not to waste too much CPU.

There appears to be a method for this in Utils.class: Utils.sleep(long millis)

However, MqttSpout, one of the spouts shipped with Apache Storm itself, takes a different approach:

public void nextTuple() {
    AckableMessage tm = this.incoming.poll();
    if(tm != null){
        ...
    } else {
        Thread.yield();
    }
}

I suspect the Storm authors may have made a mistake there, because the documentation for Thread.yield() itself carries the following note:

A hint to the scheduler that the current thread is willing to yield its current use of a processor. The scheduler is free to ignore this hint.

It is rarely appropriate to use this method.

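So which one should I use? I suspect that using Thread.yield() leads to unnecessary CPU usage.

Your spout shouldn't sleep at all. If you don't emit anything during a call, Storm handles the sleeping between calls to nextTuple, at least in the versions I'm familiar with (1.0.0 and later).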

See https://github.com/apache/storm/blob/v1.2.2/storm-core/src/clj/org/apache/storm/daemon/executor.clj#L667 for reference. The default wait strategy sleeps for a configurable interval every time it is called (1 ms by default). You can control the interval with https://github.com/apache/storm/blob/v1.2.2/storm-core/src/jvm/org/apache/storm/Config.java#L1886 or replace the wait strategy entirely with https://github.com/apache/storm/blob/v1.2.2/storm-core/src/jvm/org/apache/storm/Config.java#L1879
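
To make the division of labour concrete, here is a rough, self-contained model of that idea in plain Java. It is not Storm's code and has no Storm dependency: SimpleSpout, WaitStrategy, SleepWaitStrategy, QueueSpout and runLoop are all hypothetical names made up for this sketch. The point is only that the loop driving the spout does the waiting, so nextTuple can return immediately when there is nothing to emit.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Simplified model of an executor-driven spout loop; NOT Storm's actual code. */
class SpoutWaitSketch {

    /** Hypothetical stand-in for a spout: returns true if the call emitted something. */
    interface SimpleSpout {
        boolean nextTuple();
    }

    /** Hypothetical stand-in for a pluggable wait strategy. */
    interface WaitStrategy {
        void emptyEmit(int consecutiveEmptyCalls);
    }

    /** Sleeps a fixed interval on every empty call. */
    static final class SleepWaitStrategy implements WaitStrategy {
        private final long sleepMillis;

        SleepWaitStrategy(long sleepMillis) {
            this.sleepMillis = sleepMillis;
        }

        @Override
        public void emptyEmit(int consecutiveEmptyCalls) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt so the caller can notice it.
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Polls a queue and never sleeps or yields on its own. */
    static final class QueueSpout implements SimpleSpout {
        private final Queue<String> incoming;

        QueueSpout(Queue<String> incoming) {
            this.incoming = incoming;
        }

        @Override
        public boolean nextTuple() {
            // A real spout would emit the message here; this model only reports it.
            return incoming.poll() != null;
        }
    }

    /** Calls nextTuple 'calls' times and returns how often the wait strategy ran. */
    static int runLoop(SimpleSpout spout, WaitStrategy strategy, int calls) {
        int waits = 0;
        int emptyStreak = 0;
        for (int i = 0; i < calls; i++) {
            if (spout.nextTuple()) {
                emptyStreak = 0;
            } else {
                emptyStreak++;
                strategy.emptyEmit(emptyStreak); // the loop waits, not the spout
                waits++;
            }
        }
        return waits;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>();
        queue.add("a");
        queue.add("b");
        int waits = runLoop(new QueueSpout(queue), new SleepWaitStrategy(1), 5);
        System.out.println("wait strategy invoked " + waits + " times");
    }
}
```

With this shape, swapping in a different WaitStrategy changes the idle behaviour without touching the spout.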

Storm 2.0.0 behaves slightly differently (the sleep time gets progressively longer), but the basic idea is the same.
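
As an illustration of what "progressively longer" can mean, here is a small sketch of a wait that grows with the number of consecutive empty calls. It is not Storm 2.x's implementation: the class name ProgressiveWaitSketch, the doubling rule, and the 1 ms / 8 ms values are all assumptions chosen just to show the shape of the idea.

```java
/** Illustrative progressive back-off; the rule and numbers are invented, not Storm's. */
class ProgressiveWaitSketch {
    private final long baseMillis;
    private final long maxMillis;

    ProgressiveWaitSketch(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /** Sleep time for the n-th consecutive empty call (n >= 1): doubles each time, capped. */
    long sleepMillisFor(int emptyStreak) {
        long millis = baseMillis;
        for (int i = 1; i < emptyStreak && millis < maxMillis; i++) {
            millis *= 2;
        }
        return Math.min(millis, maxMillis);
    }

    /** Blocks the calling thread for the computed duration. */
    void idle(int emptyStreak) {
        try {
            Thread.sleep(sleepMillisFor(emptyStreak));
        } catch (InterruptedException e) {
            // Preserve the interrupt so the caller can notice it.
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ProgressiveWaitSketch wait = new ProgressiveWaitSketch(1, 8);
        for (int streak = 1; streak <= 6; streak++) {
            System.out.println("empty call " + streak + " -> sleep "
                    + wait.sleepMillisFor(streak) + " ms");
        }
    }
}
```

A busy spout pays almost nothing for this, while an idle one backs off quickly instead of polling every millisecond.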

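I think the javadoc for nextTuple is misleading, so we should probably change it. I'm also not sure what Thread.yield is doing in the MQTT spout. It looks like it has been there ever since the spout was added. If you ask on one of the mailing lists (https://storm.apache.org/getting-help.html), the author is still around and might know why it's there.

If you like, you can raise an issue at https://issues.apache.org/jira/secure/Dashboard.jspa to get this fixed :)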