Synchronizing child threads to atomic time managed by parent

Question

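I am trying to write a simulation in which different threads need to perform calculations at thread-specific intervals, based on an atomic simulation time managed by the parent thread.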

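The idea is to have the parent advance the simulation by one time step (always 1 in this example, for simplicity) and then have all threads independently check whether they need to do a calculation. Once a thread has checked, it decrements an atomic counter and waits for the next step. I expect that after running this code, the number of calculations for each thread will be exactly the length of the simulation (i.e. 10000 steps) divided by the thread-specific interval, so a thread with an interval of 4 should perform exactly 2500 calculations.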

#include <thread>
#include <iostream>
#include <atomic>
#include <string>

std::atomic<int> simTime;
std::atomic<int> tocalc;
int end = 10000;

void threadFunction(int n);

int main() {
  const int nthreads = 4;  // constant, so the array below has a fixed size
  std::thread threads[nthreads];
  for (int ii = 0; ii < nthreads; ii ++) {
    threads[ii] = std::thread(threadFunction, ii+1);
  }

  simTime = 0;
  tocalc = 0;
  while (simTime < end) {
    tocalc = nthreads - 1;
    simTime += 1;
    // do calculation
    while (tocalc > 0) {
      // wait until all the threads have done their calculation
      // or at least checked to see if they need to
    }
  }

  for (int ii = 0; ii < nthreads; ii ++) {
    threads[ii].join();
  }
}

void threadFunction(int n) {
  int prev = simTime;
  int fix = prev;
  int ncalcs = 0;
  while (simTime < end) {
    if (simTime - prev > 0) {
      prev = simTime;
      if (simTime - fix >= n) {
        // do calculation
        ncalcs ++;
        fix = simTime;
      }
      tocalc --;
    }
  }
  std::cout << std::to_string(n)+" {ncalcs} - "+std::to_string(ncalcs)+"\n";
}

However, the output does not match what I expect. One possible output is

2 {ncalcs} - 4992
1 {ncalcs} - 9983
3 {ncalcs} - 3330
4 {ncalcs} - 2448

whereas the expected output is

2 {ncalcs} - 5000
1 {ncalcs} - 10000
3 {ncalcs} - 3333
4 {ncalcs} - 2500
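
As a quick check of those expected values (a sketch that is not part of the original post; `expectedCalcs` and `printExpected` are hypothetical helpers), each count is just the number of steps divided by the thread's interval, rounded down:

```cpp
#include <iostream>

// Expected number of calculations for a thread that calculates every
// `interval` steps over `steps` simulation steps (integer division rounds down).
int expectedCalcs(int steps, int interval) {
    return steps / interval;
}

// Prints the expected count for intervals 1 through nthreads,
// in the same format as the program's own output.
void printExpected(int steps, int nthreads) {
    for (int n = 1; n <= nthreads; ++n) {
        std::cout << n << " {ncalcs} - " << expectedCalcs(steps, n) << "\n";
    }
}
```

Calling `printExpected(10000, 4)` lists the counts 10000, 5000, 3333 and 2500 for intervals 1 to 4.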

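I am wondering whether anyone knows why this way of forcing the threads to wait for the next step seems to fail. Is it a simple problem with my code, or a more fundamental problem with the approach? Any insight is appreciated, thanks.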

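Note

I am using this approach because the other methods I tried (e.g. using pipes, or joining at every step) had prohibitively expensive overhead. If there is a cheaper way to communicate between threads, I am open to suggestions.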

Answer 1

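To expand on the comment: initializing tocalc to nthreads - 1 means that on some iterations all of the child threads will have decremented tocalc before the parent thread evaluates it, because the order in which reads and writes to the atomic interleave is decided by the scheduler. So sometimes the sequence may be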

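Child 1 decrements tocalc, new value is 2
Child 3 decrements tocalc, new value is 1
Child 4 decrements tocalc, new value is 0
Child 2 decrements tocalc, new value is -1
Parent evaluates tocalc > 0, which returns false - the simulation advances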

Other times the parent's evaluation may be scheduled before the last thread decrements tocalc, i.e.

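Child 1 decrements tocalc, new value is 2
Child 3 decrements tocalc, new value is 1
Child 4 decrements tocalc, new value is 0
Parent evaluates tocalc > 0, which returns false - the simulation advances
Child 2 decrements tocalc, new value is 2 (the parent has already reset it to 3)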

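In this case child thread number 2 will miss an iteration. Because the scheduling order is semi-random, this does not happen every time, so the total number of misses is not a linear function of the number of threads but a small fraction of the total number of iterations. If you modify your code as follows, it will produce the desired result.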

#include <thread>
#include <iostream>
#include <atomic>
#include <string>

std::atomic<int> simTime;
std::atomic<int> tocalc;
int end = 10000;

void threadFunction(int n);

int main() {
    const int nthreads = 4;  // constant, so the array below has a fixed size
    simTime = 0;
    tocalc = 0;
    std::thread threads[nthreads];
    for (int ii = 0; ii < nthreads; ii ++) {
        threads[ii] = std::thread(threadFunction, ii+1);
    }

    while (simTime < end) {
        // every child decrements once per step, so start at nthreads
        // (not nthreads - 1) and wait for all of them
        tocalc = nthreads;
        simTime += 1;
        // do calculation
        while (tocalc > 0) {
            // wait until all the threads have done their calculation
            // or at least checked to see if they need to
        }
    }
    for (int ii = 0; ii < nthreads; ii ++) {
        threads[ii].join();
    }
}

void threadFunction(int n) {
    int prev = 0;
    int fix = prev;
    int ncalcs = 0;
    while (simTime < end) {
        if (simTime - prev > 0) {
            prev = simTime;
            if (simTime - fix >= n) {
                // do calculation
                ncalcs ++;
                fix = simTime;
            }
            tocalc --;
        }
    }
    std::cout << std::to_string(n)+" {ncalcs} - "+std::to_string(ncalcs)+"\n";
}

One possible output is (the order in which the threads finish is somewhat random)

2 {ncalcs} - 5000
3 {ncalcs} - 3333
1 {ncalcs} - 10000
4 {ncalcs} - 2500
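
To make the counting argument above concrete, here is a minimal single-threaded replay of the parent's wait loop. It is only an illustration under simplifying assumptions: `stragglersWhenParentAdvances` is a hypothetical helper, children are assumed to report one at a time, and the parent is assumed to re-test the counter after every report (the earliest it could possibly move on).

```cpp
// Replays the parent's wait loop without any threads. The counter starts at
// `initial`; children report one at a time, and after each report the parent
// re-tests `tocalc > 0`. Returns how many of the `nthreads` children have NOT
// yet reported at the earliest moment the parent can leave the loop.
int stragglersWhenParentAdvances(int initial, int nthreads) {
    int tocalc = initial;
    int reported = 0;
    while (tocalc > 0 && reported < nthreads) {
        --tocalc;    // one child reports that it has handled the step
        ++reported;
    }
    return nthreads - reported;
}
```

With four children, a starting value of 3 lets the parent move on while one child is still pending, whereas a starting value of 4 leaves none.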

Answer 2

Using a similar setup, I noticed that not every thread reaches the count you expect, although they are only ever off by one. For example

2 {ncalcs} - 4999
4 {ncalcs} - 2500
1 {ncalcs} - 9999
3 {ncalcs} - 3333

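and so on. Which threads it happens to, and how many of them, seems to be random. I am not sure what causes it, but I thought a warning might be useful. You can work around it by checking whether simTime - fix == 0 and, if it is not, doing one more calculation before exiting.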
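
One possible cause, stated here as an assumption since neither answer confirms it: a worker that re-tests `simTime < end` after the parent has already published the final step leaves its loop without handling that step. The sketch below avoids the re-test by having each worker count through the step numbers itself. `runSimulation` and its parameters are hypothetical names, the `++ncalcs[i]` line stands in for the real calculation, and the `yield()` calls are an addition that is not in the code above.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Runs `steps` simulation steps with `nthreads` workers; worker i (0-based)
// calculates every (i + 1)-th step. Returns the per-worker calculation counts.
std::vector<int> runSimulation(int steps, int nthreads) {
    std::atomic<int> simTime{0};
    std::atomic<int> tocalc{0};
    std::vector<int> ncalcs(nthreads, 0);
    std::vector<std::thread> workers;

    for (int i = 0; i < nthreads; ++i) {
        workers.emplace_back([&, i] {
            const int interval = i + 1;
            int fix = 0;
            // Loop over step numbers instead of re-testing simTime < steps,
            // so the final step cannot be skipped.
            for (int step = 1; step <= steps; ++step) {
                while (simTime.load() < step) {
                    std::this_thread::yield();  // wait until the parent publishes this step
                }
                if (step - fix >= interval) {
                    ++ncalcs[i];  // stand-in for the real calculation
                    fix = step;
                }
                --tocalc;  // report that this worker has handled the step
            }
        });
    }

    for (int step = 1; step <= steps; ++step) {
        tocalc = nthreads;  // must be set before the step is published
        simTime = step;
        while (tocalc.load() > 0) {
            std::this_thread::yield();  // wait for every worker to report
        }
    }
    for (auto& w : workers) {
        w.join();
    }
    return ncalcs;
}
```

Calling `runSimulation(10000, 4)` should give every worker exactly `steps / interval` calculations regardless of scheduling, because the parent cannot publish the next step until all workers have reported the current one, and no worker exits before it has handled the last step.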

