Why is Java RMI with multiple threads on the client faster even though the server is single-threaded?

Question

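I have recently been taking a lot of measurements of my RMI program to determine which of its operations are the most expensive (for example, marshalling objects, running methods, and so on). Basically, as you can see in the code below, I have an array of floats that can be passed as an argument to three remote operations: power (of all elements), logarithm in any base (of all elements), and adding an offset (to all elements). The array has size N = 10^8.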

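My client is multithreaded (K threads): it splits the array into K chunks of N/K elements and hands each chunk to one thread, and each thread makes one RMI call for its chunk. The server is purely single-threaded. The client and the server run on the same machine.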

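For K = 1, 2, 4, 8, 16, 32 client threads, the time each of these methods takes to return is as follows (in seconds, sampled over 10 iterations; machine: quad-core i7 with 8 logical processors):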

Log in any base (2 calls to Math.log):
- K=1 -> 7.306161
- K=2 -> 3.698500
- K=4 -> 2.788655
- K=8 -> 2.679441 (best)
- K=16 -> 2.754160
- K=32 -> 2.812091

Sum offset (a simple addition, no calls to other methods):
- K=1 -> 3.573020
- K=2 -> 1.864782 (best)
- K=4 -> 1.874423
- K=8 -> 2.455411
- K=16 -> 2.752766
- K=32 -> 2.695977

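I also measured CPU usage for each method: when adding the offset, CPU usage stays around 60% most of the time, while the log method needs more than 80% CPU and peaks at 100% many times. I also tried the power method (omitted from the code below), which shows results very similar to adding the offset (just slightly more expensive).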

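It is tempting to conclude that adding an offset is so cheap that handling more threads only makes it more costly, because of thread scheduling and so on, and that, since computing logs is more expensive, more threads make the problem run faster, which is why K=8 is the sweet spot on my 8-CPU machine.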
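
One way to test that reasoning is to separate marshalling cost from compute cost. The following is only a rough sketch and not part of the program below: `MarshalCostSketch` and `roundTrip` are hypothetical names, and it uses plain Java serialization through an in-memory buffer as a stand-in for what happens to a `float[]` argument, ignoring sockets and the reply. It assumes a smaller array (10^6 elements) to keep memory use modest.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class MarshalCostSketch {

    // Serializes the array to bytes and reads it back, returning a fresh copy.
    public static float[] roundTrip(float[] data) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(data.length * 4 + 64);
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (float[]) in.readObject();
        }
    }

    // Same loop body as the addOffset method in the question.
    public static void addOffset(float[] vector, int offset) {
        for (int i = 0; i < vector.length; i++) {
            vector[i] = vector[i] + offset;
        }
    }

    public static void main(String[] args) throws Exception {
        float[] data = new float[1_000_000]; // assumed size, much smaller than N
        for (int i = 0; i < data.length; i++) {
            data[i] = (i % 100) / 100f;
        }

        long t0 = System.nanoTime();
        float[] copy = roundTrip(data);
        long t1 = System.nanoTime();
        addOffset(copy, 100);
        long t2 = System.nanoTime();

        System.out.printf("serialize + deserialize: %.3f ms%n", (t1 - t0) / 1e6);
        System.out.printf("addOffset loop:          %.3f ms%n", (t2 - t1) / 1e6);
    }
}
```

Single-shot timings like these are noisy (JIT warm-up, garbage collection), so the numbers are only a rough indication of how the two costs compare.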

However, the server is single-threaded! How is this possible? How can 8 client threads do better than 2 client threads in that case?

I'm having a lot of trouble making sense of these results. Any help is appreciated. The code for each module is shown below.

Code for Server.java:

package rmi;

import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.registry.LocateRegistry;

public class Server {

    Server(){
        try {
            System.setProperty("java.rmi.server.hostname", "localhost");
            LocateRegistry.createRegistry(1099);
            Service s = new ServiceImple();
            Naming.bind("Service", (Remote) s);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args){
        new Server();
    }
}

Code for Client.java:

package rmi;

import java.rmi.Naming;
import java.util.ArrayList;

public class Client {

    private static final int N = 100000000;
    private static final int K = 64;
    private static final int iterations = 1;



    public static void main(String[] args) throws InterruptedException {

        //Constants and state for a simple linear congruential pseudo-random generator:
        final int a = 25173;
        final int b = 13849;
        final int m = 3276;

        int x = m/2;

        //Create a list of float arrays, one chunk per thread:
        ArrayList<float[]> vector = new ArrayList<>();

        for (int i=0; i<K; i++){
            vector.add(new float[N/K]);
            for (int j=0; j<N/K; j++){
                x = (a * x + b) % m;
                vector.get(i)[j] = (float) x/m;
            }
        }

        long startTime = System.nanoTime();

        for (int count=0; count<iterations; count++){

            //Creates the list of threads:
            ArrayList<ClientThread> threads = new ArrayList<>();

            //Starts the threads
            for (int i=0; i<K; i++){
                threads.add(new ClientThread(vector.get(i), N/K));
                threads.get(i).start();
            }

            //Waits for threads to end:
            for (int i=0; i<K; i++){
                threads.get(i).join();
            }

        }

        long estimatedTime = System.nanoTime() - startTime;

        estimatedTime /= iterations;

        System.out.println("Each loop took: "+(float)estimatedTime/1000000000);
    }

}


class ClientThread extends Thread{

    private float[] vector;
    private int vSize;

    public ClientThread(float[] vectorArg, int initSize){
        vector = vectorArg;
        vSize = initSize;
    }

    @Override
    public void run(){
        try {
            Service s = (Service) Naming.lookup("rmi://localhost:1099/Service");

            //Calculates log in RMI:
            //vector = (float[]) s.log(vector, vSize, 2);

            //Adds an offset in RMI:
            vector = (float[]) s.addOffset(vector, vSize, 100);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Code for Service.java:

package rmi;

import java.rmi.Remote;
import java.rmi.RemoteException;

public interface Service extends Remote {

    //Return log in parameter base of all elements in vector:
    public float[] log(float[] vector, int vSize, int base) throws RemoteException;

    //Adds an offset to all elements in vector:
    public float[] addOffset(float[] vector, int vSize, int offset) throws RemoteException;
}

Code for ServiceImple.java:

package rmi;

import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;


public class ServiceImple extends UnicastRemoteObject implements Service{

    private static final long serialVersionUID = 1L;

    protected ServiceImple() throws RemoteException {
        super();
    }


    @Override
    public float[] log(float[] vector, int vSize, int base) throws RemoteException {
        for (int i=0; i<vSize; i=i+1){
            vector[i] = (float) (Math.log(vector[i])/Math.log(base));
        }
        return vector;
    }

    @Override
    public float[] addOffset(float[] vector, int vSize, int offset) throws RemoteException {
        for (int i=0; i<vSize; i=i+1){
            vector[i] = vector[i] + offset;
        }
        return vector;
    }
}

Answer 1

Section 3.2 of the RMI specification states:

A method dispatched by the RMI runtime to a remote object implementation may or may not execute in a separate thread. The RMI runtime makes no guarantees with respect to mapping remote object invocations to threads.

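So at the very least there is no guarantee that RMI requests execute in a single thread. What we see in practice is that the standard Java RMI implementation uses multiple threads to handle requests.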
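
A quick way to observe this is to have the remote method report the thread it runs on. The sketch below is not from the question: `DispatchThreadProbe`, `Probe`, and `whoAmI` are hypothetical names, and it assumes the stock RMI transport that ships with the JDK, with client and server in one JVM for brevity (no registry is needed because the stub is used directly). Each call waits up to 5 seconds until all callers have arrived, so the calls are forced to overlap.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class DispatchThreadProbe {

    /** Hypothetical remote interface used only for this probe. */
    public interface Probe extends Remote {
        String whoAmI() throws RemoteException;
    }

    static class ProbeImpl implements Probe {
        private final CountDownLatch allArrived;

        ProbeImpl(int expectedCalls) {
            allArrived = new CountDownLatch(expectedCalls);
        }

        @Override
        public String whoAmI() throws RemoteException {
            // Hold every call until all callers are inside the method (or 5 s pass),
            // so overlapping calls cannot be served one after another by one thread.
            allArrived.countDown();
            try {
                allArrived.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return Thread.currentThread().getName();
        }
    }

    /** Makes `clients` concurrent remote calls and returns the server-side thread names. */
    public static Set<String> serverThreadsFor(int clients) throws Exception {
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        ProbeImpl impl = new ProbeImpl(clients);
        // Port 0 asks RMI to pick an anonymous port; the returned stub calls over TCP.
        Probe stub = (Probe) UnicastRemoteObject.exportObject(impl, 0);
        Set<String> names = ConcurrentHashMap.newKeySet();
        Thread[] callers = new Thread[clients];
        try {
            for (int i = 0; i < clients; i++) {
                callers[i] = new Thread(() -> {
                    try {
                        names.add(stub.whoAmI());
                    } catch (RemoteException e) {
                        e.printStackTrace();
                    }
                });
                callers[i].start();
            }
            for (Thread t : callers) {
                t.join();
            }
        } finally {
            // Unexport so the JVM can exit once main returns.
            UnicastRemoteObject.unexportObject(impl, true);
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Set<String> names = serverThreadsFor(4);
        System.out.println(names.size() + " server thread(s): " + names);
    }
}
```

If more than one name is reported, the remote object was invoked on several threads at once, even though the server code never creates a thread itself. The exact thread names and their number are implementation details and may differ between JDKs.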

Answer 2

Why is Java RMI with multiple threads on client faster even though server is single-threaded?

Because your assumption is wrong. The RMI server is not single-threaded.
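
If the goal is to measure a server that really does process one call at a time, the calls have to be serialized explicitly, for example by making the remote methods `synchronized`. The sketch below illustrates the idea in plain Java, without RMI, so that it runs on its own: `SerializedService`, `maxConcurrentCalls`, and `runCallers` are hypothetical names, and the counters exist only to show the effect. In the question's code the same keyword would go on the methods of `ServiceImple`.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SerializedService {

    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger maxActive = new AtomicInteger();

    // synchronized: only one caller at a time can be inside this method.
    public synchronized float[] addOffset(float[] vector, int vSize, int offset) {
        int now = active.incrementAndGet();
        maxActive.accumulateAndGet(now, Math::max);
        try {
            for (int i = 0; i < vSize; i++) {
                vector[i] = vector[i] + offset;
            }
            return vector;
        } finally {
            active.decrementAndGet();
        }
    }

    // Highest number of callers ever observed inside addOffset at once.
    public int maxConcurrentCalls() {
        return maxActive.get();
    }

    // Calls addOffset from `callers` threads and reports the observed maximum.
    public static int runCallers(int callers) throws InterruptedException {
        SerializedService service = new SerializedService();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(
                () -> service.addOffset(new float[100_000], 100_000, 100));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return service.maxConcurrentCalls();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("max concurrent calls: " + runCallers(8)); // prints 1
    }
}
```

Even with this change, only the method body is serialized. As far as I know, the RMI runtime still unmarshals arguments and marshals results on its own dispatch threads, so part of the work for concurrent calls could still overlap.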
