Why do all threads always take the same time, why does the OS run all threads "in parallel"?

Question

Why is the amount of time per thread always equal when all threads execute the same instructions?

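I have set up a program to measure the time a thread takes to perform a specific operation (see below for more details). My MacBook has 4 physical cores, or 8 logical cores (with hyper-threading), so I assumed I could run at most 8 threads in parallel.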

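I expected the OS to run at most 8 threads in parallel and, once those had finished, to run the remaining threads. Contrary to these expectations, however, all threads (no matter how many) always seem to take the same time. Why does the OS split them up, and why does it start all of the threads instead of finishing the first 8 first? Why does it try to run all of them "in parallel" (I know it cannot actually run more than 8 in parallel)?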

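See also the chart below; each bar represents the average time for n threads, with n from 1 to 17. Surprisingly, the average time equals the time of each individual thread (this is what I find odd and don't understand: why do they all take the same time?):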

Here is the piece of code I used:

#include <stdio.h>
#include <sys/time.h>
#include <pthread.h>

#define MAX 1000000000

void * calculate (void * val) {
  /* only time measurement ... */
  long a = 0, start, end;
  struct timeval timecheck;

  gettimeofday(&timecheck, NULL);
  start = (long)timecheck.tv_sec * 1000 + (long)timecheck.tv_usec / 1000;

  /* actual calculation (just for testing) */
  for (unsigned i = 1; i <= MAX; ++i)
    a += i;

  gettimeofday(&timecheck, NULL);
  end = (long)timecheck.tv_sec * 1000 + (long)timecheck.tv_usec / 1000;

  printf("%ld; time: %ldms\n", a, (end - start));

  return NULL;
}


int main (int argc, char ** argv) {
  pthread_t thread_1, thread_2, thread_3, thread_4; /* thread_5, ...*/
  long start, end;
  struct timeval timecheck;

  gettimeofday(&timecheck, NULL);
  start = (long)timecheck.tv_sec * 1000 + (long)timecheck.tv_usec / 1000;

  if (pthread_create(&thread_1, NULL, &calculate, NULL) != 0)
    perror("Couldn't create thread 1");
  /* ... repeat the above two lines for each thread ... */

  pthread_join(thread_1, NULL);
  /* ... repeat the line above for each thread ... */

  gettimeofday(&timecheck, NULL);
  end = (long)timecheck.tv_sec * 1000 + (long)timecheck.tv_usec / 1000;

  printf("total time: %ldms\n", (end - start));
}

Answer 1

why does it start all of the threads instead of finishing say the first 8 threads first?

You asked it to create and start 17 threads. So it created and started 17 threads.

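It has no way of knowing whether it would be safe to finish running 8 of them before scheduling the others. It doesn't even know that doing so would be beneficial. (Imagine if the threads spent most of their time blocked.) But most importantly, it doesn't even know whether the threads will ever finish!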

Perhaps you want to use a thread pool model, in which a limited number of threads take jobs from a shared queue.

Example of a thread pool (in C)

Example of a thread pool (in Perl):

use threads;

use Thread::Queue qw( );

use constant NUM_WORKERS => 10;

sub work {
   my $job = shift;
   ...
}

my $q = Thread::Queue->new();  # A thread-safe queue.

# Create worker threads.
my @threads;
for (1..NUM_WORKERS) {
   push @threads, async {
      while ( defined( my $job = $q->dequeue() ) ) {
         work($job);
      }
   };
}

# Feed them work
for my $job (...) {
    $q->enqueue($job);
}

# $q->dequeue normally blocks when the queue is empty.
# This causes it to return an undef value instead.
$q->end();  

# Wait for the workers to complete the work.
$_->join() for @threads;

There are thread pool libraries for C. (And for Perl, for that matter.)

Tags: c, macos, multithreading, posix, pthreads
