Which is better for CPU usage: waiting for a function to return with std::future::wait(), or checking a flag and sleeping for a while in a loop?

Q1: Which uses less CPU, std::future::wait() or checking a flag in a while loop?

std::atomic_bool isRunning{false};

void foo(){
    isRunning.store(true);
    doSomethingTimeConsuming();
    isRunning.store(false);
}

std::future<void> f = std::async(std::launch::async, foo);

Using std::future::wait():

if (f.valid())
    f.wait();

Checking a flag in a while loop:

if (f.valid()) {
    while(isRunning.load())
       std::this_thread::sleep_for(1ms);
}

Q2: Does the conclusion also apply to std::thread::join() or std::condition_variable::wait()?

Thanks in advance.

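std::this_thread::sleep_for keeps waking the thread up unnecessarily, and at the wrong times. The average latency between the result becoming ready and the waiting thread noticing it is half the sleep_for timeout.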
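As a rough check of the half-timeout figure, assume the result becomes ready at a time t uniformly distributed within one sleep interval of length T, and that sleep_for wakes exactly on schedule (in practice it may oversleep, which only adds latency). The waiter then notices after T - t, so:

```latex
\mathbb{E}[\text{delay}] = \frac{1}{T}\int_0^T (T - t)\,dt = \frac{T}{2}
```

With T = 1ms that is roughly 0.5ms of added latency on average under these assumptions.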

std::future::wait is more efficient because it blocks in the kernel until the result is ready, without making multiple unnecessary system calls, unlike std::this_thread::sleep_for.

If you run both versions with

void doSomethingTimeConsuming() {
    std::this_thread::sleep_for(1s);
}

under perf stat, the results for std::future::wait are:

          1.803578      task-clock (msec)         #    0.002 CPUs utilized          
                 2      context-switches          #    0.001 M/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               116      page-faults               #    0.064 M/sec                  
         6,356,215      cycles                    #    3.524 GHz                    
         4,511,076      instructions              #    0.71  insn per cycle         
           835,604      branches                  #  463.304 M/sec                  
            22,313      branch-misses             #    2.67% of all branches        

whereas for std::this_thread::sleep_for(1ms):

         11.715249      task-clock (msec)         #    0.012 CPUs utilized          
               901      context-switches          #    0.077 M/sec                  
                 6      cpu-migrations            #    0.512 K/sec                  
               118      page-faults               #    0.010 M/sec                  
        40,177,222      cycles                    #    3.429 GHz                      
        25,401,055      instructions              #    0.63  insn per cycle
         2,286,806      branches                  #  195.199 M/sec  
           156,400      branch-misses             #    6.84% of all branches        

That is, in this particular test, sleep_for burns about 6 times as many CPU cycles.


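Note that there is a race condition between isRunning.load() and isRunning.store(true): the waiting thread may check the flag before foo() has set it, and leave the loop immediately. The fix is to initialize isRunning{true};.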