Service times directly proportional to number of threads
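My system is a dual-core i5 with hyper-threading, so Windows shows 4 processors. When I run an optimized, CPU-bound task on a single thread, one at a time, its service time is always around 35 ms. But when I hand 2 tasks to 2 threads at the same time, their service times are around 70 ms each. My system has 4 processors, so why does the service time rise to about 70 ms with 2 threads, when the 2 threads should run on 2 processors without any scheduling overhead? The code is as follows.
The CPU-bound task: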
import java.math.BigInteger;

public class CpuBoundJob implements Runnable {
    public void run() {
        BigInteger factValue = BigInteger.ONE;
        long t1 = System.nanoTime();
        for (int i = 2; i <= 2000; i++) {
            factValue = factValue.multiply(BigInteger.valueOf(i));
        }
        long t2 = System.nanoTime();
        System.out.println("Service Time(ms)=" + ((double) (t2 - t1) / 1000000));
    }
}
The thread that runs the task:
public class TaskRunner extends Thread {
    CpuBoundJob job = new CpuBoundJob();

    public void run() {
        job.run();
    }
}
Finally, the main class:
public class Test2 {
    int numberOfThreads = 100; // number of warm-up threads for the JIT

    public Test2() {
        for (int i = 1; i <= numberOfThreads; i++) { // warm-up code for the JIT
            TaskRunner t = new TaskRunner();
            t.start();
        }
        try {
            Thread.sleep(5000); // wait a little bit
        } catch (Exception e) {}
        System.out.println("Warmed up completed! now start benchmarking");
        System.out.println("First run single thread at a time");
        try { // wait for the warm-up threads to complete
            Thread.sleep(5000);
        } catch (Exception e) {}
        // run only one thread at a time
        TaskRunner t1 = new TaskRunner();
        t1.start();
        try { // wait for the thread to complete
            Thread.sleep(5000);
        } catch (Exception e) {}
        // now start several threads simultaneously (the loop below starts 3)
        System.out.println("Now run 3 thread at a time");
        for (int i = 1; i <= 3; i++) {
            TaskRunner t2 = new TaskRunner();
            t2.start();
        }
    }

    public static void main(String[] args) {
        new Test2();
    }
}
Final output:
Warmed up completed! now start benchmarking
First run single thread at a time
Service Time(ms)=5.829112
Now run 2 thread at a time
Service Time(ms)=6.518721
Service Time(ms)=10.364269
Service Time(ms)=10.272689
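I timed this in various scenarios and, with a slightly modified task, got times of about 45 ms with one thread and about 60 ms with two threads. So even in this example, in one second, one thread can complete about 22 tasks, but two threads can complete about 33 tasks.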
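Those throughput figures come from simple arithmetic: tasks per second is the number of threads times 1000, divided by the per-task time in milliseconds. Here is a minimal sketch of that calculation. The class and method names are made up for illustration, and it assumes the threads really do run in parallel at the measured per-task time:

```java
final class Throughput
{
    /** Tasks completed per second by `threads` workers when each task takes `taskMillis` ms. */
    static double tasksPerSecond(int threads, double taskMillis)
    {
        return threads * 1000.0 / taskMillis;
    }

    public static void main(String... argv)
    {
        // 1 thread at ~45 ms per task, then 2 threads at ~60 ms per task
        System.out.printf("one thread:  %.1f tasks/s%n", tasksPerSecond(1, 45));
        System.out.printf("two threads: %.1f tasks/s%n", tasksPerSecond(2, 60));
    }
}
```

So although each task takes longer with two threads, the total number of tasks finished per second still goes up by roughly half.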
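However, if you run a task that doesn't tax the garbage collector so heavily, you should see the performance increase you expect: two threads complete twice as many tasks. Here is my version of your test program.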
Note that I made one significant change to your task (DirtyTask): n was always 0, because you cast the result of Math.random() to int (which gives zero) and only then multiplied by 13.
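To see why n was always 0, compare the two forms side by side. This is only an illustrative sketch; the class and method names are hypothetical and not part of the original program:

```java
final class CastPrecedence
{
    /** The cast applies to Math.random() alone, truncating a value in [0, 1) to 0 before multiplying. */
    static int castThenMultiply()
    {
        return (int) Math.random() * 13;
    }

    /** Parenthesizing the product first yields an integer in the range 0..12. */
    static int multiplyThenCast()
    {
        return (int) (Math.random() * 13);
    }

    public static void main(String... argv)
    {
        // The left column is always 0; the right column varies between 0 and 12.
        for (int i = 0; i < 5; i++)
            System.out.println(castThenMultiply() + " vs " + multiplyThenCast());
    }
}
```

A cast binds more tightly than multiplication, which is why the extra parentheses are needed.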
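Then I added a CleanTask, which doesn't generate any new objects for the garbage collector to handle. Please test it on your machine and report the results. On mine, I got this: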
Testing "clean" task.
Average task time: one thread = 46 ms; two threads = 45 ms
Testing "dirty" task.
Average task time: one thread = 41 ms; two threads = 62 ms
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class Parallels
{

  private static final int RUNS = 10;

  public static void main(String... argv)
    throws Exception
  {
    System.out.println("Testing \"clean\" task.");
    flavor(CleanTask::new);
    System.out.println("Testing \"dirty\" task.");
    flavor(DirtyTask::new);
  }

  private static void flavor(Supplier<Callable<Long>> tasks)
    throws InterruptedException, ExecutionException
  {
    ExecutorService warmup = Executors.newFixedThreadPool(100);
    for (int i = 0; i < 100; ++i)
      warmup.submit(tasks.get());
    warmup.shutdown();
    warmup.awaitTermination(1, TimeUnit.DAYS);
    ExecutorService workers = Executors.newFixedThreadPool(2);
    long t1 = test(1, tasks, workers);
    long t2 = test(2, tasks, workers);
    System.out.printf("Average task time: one thread = %d ms; two threads = %d ms%n", t1 / (1 * RUNS), t2 / (2 * RUNS));
    workers.shutdown();
  }

  private static long test(int n, Supplier<Callable<Long>> tasks, ExecutorService workers)
    throws InterruptedException, ExecutionException
  {
    long sum = 0;
    for (int i = 0; i < RUNS; ++i) {
      List<Callable<Long>> batch = new ArrayList<>(n);
      for (int t = 0; t < n; ++t)
        batch.add(tasks.get());
      List<Future<Long>> times = workers.invokeAll(batch);
      for (Future<Long> f : times)
        sum += f.get();
    }
    return TimeUnit.NANOSECONDS.toMillis(sum);
  }

  /**
   * Do something on the CPU without creating any garbage, and return the
   * elapsed time.
   */
  private static class CleanTask
    implements Callable<Long>
  {
    @Override
    public Long call()
    {
      long time = System.nanoTime();
      long x = 0;
      for (int i = 0; i < 15_000_000; i++)
        x ^= ThreadLocalRandom.current().nextLong();
      if (x == 0)
        throw new IllegalStateException();
      return System.nanoTime() - time;
    }
  }

  /**
   * Do something on the CPU that creates a lot of garbage, and return the
   * elapsed time.
   */
  private static class DirtyTask
    implements Callable<Long>
  {
    @Override
    public Long call()
    {
      long time = System.nanoTime();
      String s = "";
      for (int i = 0; i < 10_000; i++)
        s += (int) (ThreadLocalRandom.current().nextDouble() * 13);
      if (s.length() == 10_000)
        throw new IllegalStateException();
      return System.nanoTime() - time;
    }
  }

}
for(int i=0;i<10000;i++)
{
    int n=(int)Math.random()*13;
    s+=name.valueOf(n);
    //s+="*";
}
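This code is a tight spin around a resource that can be accessed by only one thread at a time. So each thread just has to wait for the other to release the random number generator before it can use it.
As the docs for Math.random say: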
When this method is first called, it creates a single new pseudorandom-number generator, exactly as if by the expression
new java.util.Random()
This new pseudorandom-number generator is used thereafter for all calls to this method and is used nowhere else.
This method is properly synchronized to allow correct use by more than one thread. However, if many threads need to generate pseudorandom numbers at a great rate, it may reduce contention for each thread to have its own pseudorandom-number generator.
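One way to follow that advice is to give each thread its own generator through ThreadLocalRandom, so the workers never touch a shared one. The sketch below is only an illustration under that assumption; the class and method names (PerThreadRandom, sumOfRandoms, runInParallel) are invented here and it measures nothing by itself:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

final class PerThreadRandom
{
    /** Sum `count` random values in 0..12, drawn from the calling thread's own generator. */
    static long sumOfRandoms(int count)
    {
        long sum = 0;
        for (int i = 0; i < count; i++)
            sum += ThreadLocalRandom.current().nextInt(13);
        return sum;
    }

    /** Run the same job on `threads` workers at once and collect each worker's sum. */
    static List<Long> runInParallel(int threads, int count)
    {
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> batch = new ArrayList<>(threads);
            for (int t = 0; t < threads; t++)
                batch.add(() -> sumOfRandoms(count));
            List<Long> sums = new ArrayList<>(threads);
            for (Future<Long> f : workers.invokeAll(batch))
                sums.add(f.get());
            return sums;
        } catch (InterruptedException | ExecutionException e) {
            // Wrapped only to keep the sketch short; real code should handle these properly.
            throw new IllegalStateException(e);
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String... argv)
    {
        System.out.println(runInParallel(2, 1_000_000));
    }
}
```

Swapping ThreadLocalRandom.current().nextInt(13) for (int) (Math.random() * 13) in sumOfRandoms gives the shared-generator variant, which makes it easy to time the two against each other on your own machine.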