Semaphore latency faster than expected - why?

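Acquiring a semaphore is done by blocking. According to the internet and ClockRes, the interrupt frequency / timer interval on Windows shouldn't be under 0.5ms. The program below measures the time between the release and the acquisition of a semaphore in different threads. I wouldn't expect this to be faster than 0.5ms, yet I reliably get results of ~0.017ms. (Oddly, with a high standard deviation of +- 100%.)

Either my measurement code is wrong, or my understanding of how semaphores work is. Which is it? Here is the code, without the boring part that calculates the mean and standard deviation: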

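#include <atomic>
#include <chrono>
#include <format>
#include <iostream>
#include <semaphore>
#include <thread>
#include <vector>

using namespace std::chrono_literals; // for the 5ms literals

namespace {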
   std::binary_semaphore semaphore{ 0 };
   std::atomic<std::chrono::high_resolution_clock::time_point> t1;
}

auto acquire_and_set_t1() {
   semaphore.acquire(); // this is being measured
   t1.store(std::chrono::high_resolution_clock::now());
}

auto measure_semaphore_latency() -> double {
   std::jthread j(acquire_and_set_t1);
   std::this_thread::sleep_for(5ms); // To make sure thread is running

   // Signal thread and start timing
   const auto t0 = std::chrono::high_resolution_clock::now();
   semaphore.release();

   std::this_thread::sleep_for(5ms); // To make sure thread is done writing t1

   const double ms = std::chrono::duration_cast<std::chrono::nanoseconds>(t1.load() - t0).count() / 1'000'000.0;
   return ms;
}

auto main() -> int {
   std::vector<double> runtimes;
   for (int i = 0; i < 100; ++i)
      runtimes.emplace_back(measure_semaphore_latency());

   const auto& [mean, relative_std] = get_mean_and_std(runtimes);
   std::cout << std::format("mean: {:.3f} ms, +- {:.2f}%\n", mean, 100.0 * relative_std);
}
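The omitted helper is not shown in the question. A minimal sketch of what it might look like follows, assuming it returns the mean together with the relative (population) standard deviation; the body is a guess for illustration, not the code that produced the numbers above.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical stand-in for the omitted helper: returns {mean, relative standard deviation}.
// Assumption: population (not sample) standard deviation, divided by the mean.
auto get_mean_and_std(const std::vector<double>& values) -> std::pair<double, double> {
   if (values.empty())
      return { 0.0, 0.0 };

   double sum = 0.0;
   for (const double v : values)
      sum += v;
   const double mean = sum / static_cast<double>(values.size());

   double squared_deviations = 0.0;
   for (const double v : values)
      squared_deviations += (v - mean) * (v - mean);
   const double std_dev = std::sqrt(squared_deviations / static_cast<double>(values.size()));

   // Guard against dividing by a zero mean.
   return { mean, mean != 0.0 ? std_dev / mean : 0.0 };
}
```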

Edit: the source for the Windows timer resolution is https://randomascii.wordpress.com/2020/10/04/windows-timer-resolution-the-great-rule-change/ and ClockRes.

Your confusion comes from the incorrect assumption that this limit applies here:

According to the internet and clockres, the interrupt frequency / timer interval on Windows shouldn't be under 0.5ms.

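Preemptive/timer-based scheduling is not necessarily the only opportunity for the OS to assign a thread to a CPU core. Explicit/manual signaling can bypass it.

You can think of std::binary_semaphore::release() as triggering an immediate, partial run of the scheduler, restricted to the threads that happen to be blocked in std::binary_semaphore::acquire() on that same semaphore.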

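That is what is happening here. The thread blocked in acquire_and_set_t1() is woken, and possibly assigned to a CPU core, right at the release() call in measure_semaphore_latency(), without waiting for the next scheduling "cycle".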

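There is still no guarantee that the OS will choose to preempt anything in favor of the thread that was just woken. That is where the high standard deviation you see comes from: the thread either gets CPU time immediately, or it gets it at a later scheduling cycle, with nothing in between.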

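As for why I can be so confident that this is what happens in your test: with a bit of debugging and symbol loading, we can obtain the following call stacks:

Acquiring side: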

    ntdll.dll!00007fffa4510764()    Unknown
    ntdll.dll!00007fffa44d379d()    Unknown
    ntdll.dll!00007fffa44d3652()    Unknown
    ntdll.dll!00007fffa44d3363()    Unknown
    KernelBase.dll!00007fffa225ce9f()   Unknown
>   msvcp140d_atomic_wait.dll!`anonymous namespace'::__crtWaitOnAddress(volatile void * Address, void * CompareAddress, unsigned __int64 AddressSize, unsigned long dwMilliseconds) Line 174    C++
    msvcp140d_atomic_wait.dll!__std_atomic_wait_direct(const void * _Storage, void * _Comparand, unsigned __int64 _Size, unsigned long _Remaining_timeout) Line 234 C++
    ConsoleApplication2.exe!std::_Atomic_wait_direct<unsigned char,char>(const std::_Atomic_storage<unsigned char,1> * const _This, char _Expected_bytes, const std::memory_order _Order) Line 491  C++
    ConsoleApplication2.exe!std::_Atomic_storage<unsigned char,1>::wait(const unsigned char _Expected, const std::memory_order _Order) Line 829 C++
    ConsoleApplication2.exe!std::counting_semaphore<1>::acquire() Line 245  C++
    ConsoleApplication2.exe!acquire_and_set_t1() Line 17    C++
    ConsoleApplication2.exe!std::invoke<void (__cdecl*)(void)>(void(*)() && _Obj) Line 1586 C++
    ConsoleApplication2.exe!std::thread::_Invoke<std::tuple<void (__cdecl*)(void)>,0>(void * _RawVals) Line 56  C++
    ucrtbased.dll!00007fff4c7b542c()    Unknown
    kernel32.dll!00007fffa2857034() Unknown
    ntdll.dll!00007fffa44c2651()    Unknown

Releasing side:

    ntdll.dll!00007fffa44d2550()    Unknown 
>   msvcp140d_atomic_wait.dll!`anonymous namespace'::__crtWakeByAddressSingle(void * Address) Line 179  C++
    msvcp140d_atomic_wait.dll!__std_atomic_notify_one_direct(const void * _Storage) Line 251    C++
    ConsoleApplication2.exe!std::_Atomic_storage<unsigned char,1>::notify_one() Line 833    C++
    ConsoleApplication2.exe!std::counting_semaphore<1>::release(const __int64 _Update) Line 232 C++
    ConsoleApplication2.exe!measure_semaphore_latency() Line 29 C++
    ConsoleApplication2.exe!main() Line 36  C++
    ConsoleApplication2.exe!invoke_main() Line 79   C++
    ConsoleApplication2.exe!__scrt_common_main_seh() Line 288   C++
    ConsoleApplication2.exe!__scrt_common_main() Line 331   C++
    ConsoleApplication2.exe!mainCRTStartup(void * __formal) Line 17 C++
    kernel32.dll!00007fffa2857034() Unknown
    ntdll.dll!00007fffa44c2651()    Unknown

Looking at the code of __crtWakeByAddressSingle() and __crtWaitOnAddress() (see on GitHub), we find that the invoked Windows API functions are WaitOnAddress() and WakeByAddressSingle().

In the documentation for WaitOnAddress(), the Remarks section gives us our confirmation:

WaitOnAddress does not interfere with the thread scheduler.