Why does an exception occur in a program that uses MPJ Express?
I have a program that multiplies a matrix by a vector using MPJ Express. The matrix is partitioned by rows. However, an exception is thrown while it runs. Am I doing something wrong?
import java.util.Random;

import mpi.Comm;
import mpi.MPI;

public class Main {

    private static final int rootProcessorRank = 0;

    private static Comm comunicator;
    private static int processorsNumber;
    private static int currentProcessorRank;

    public static void main(String[] args) {
        MPI.Init(args);
        comunicator = MPI.COMM_WORLD;
        currentProcessorRank = comunicator.Rank();
        processorsNumber = comunicator.Size();
        if (currentProcessorRank == rootProcessorRank) {
            rootProcessorAction();
        } else {
            notRootProcessorAction();
        }
        MPI.Finalize();
    }

    public static void rootProcessorAction() {
        int[] matrixVectorSize = new int[] {5};
        int[][] matrix = createAndInitMatrix(matrixVectorSize[0]);
        int[] vector = createAndInitVector(matrixVectorSize[0]);

        for (int i = 1; i < processorsNumber; i++) {
            comunicator.Isend(matrixVectorSize, 0, 1, MPI.INT, i, MPI.ANY_TAG);
            System.out.println("Proc: " + currentProcessorRank + ", send matrixVectorSize");
            comunicator.Isend(vector, 0, vector.length, MPI.INT, i, MPI.ANY_TAG);
            System.out.println("Proc: " + currentProcessorRank + ", send vector");
        }

        int averageRowsPerProcessor = matrix.length / (processorsNumber - 1);
        int[] rowsPerProcessor = new int[processorsNumber];
        int notDistributedRowsNumber = matrix.length;
        for (int i = 1; i < rowsPerProcessor.length; i++) {
            if (i == rowsPerProcessor.length - 1) {
                rowsPerProcessor[i] = notDistributedRowsNumber;
            } else {
                rowsPerProcessor[i] = averageRowsPerProcessor;
                notDistributedRowsNumber -= averageRowsPerProcessor;
            }
        }

        int offset = 0;
        // rowsPerProcessor[0] always stays 0, because the root keeps no rows for itself
        for (int i = 1; i < rowsPerProcessor.length; i++) {
            int[] processorRows = new int[1];
            processorRows[0] = rowsPerProcessor[i];
            comunicator.Isend(processorRows, 0, 1, MPI.INT, i, MPI.ANY_TAG);
            comunicator.Isend(matrix, offset, processorRows[0], MPI.OBJECT, i, MPI.ANY_TAG);
            offset += rowsPerProcessor[i];
        }
        // code that receives the sub-results from all processes will go here
    }

    public static void notRootProcessorAction() {
        int[] matrixVectorSize = new int[1];
        int[] rowsNumber = new int[1];
        int[] vector = null;
        int[][] subMatrix = null;

        comunicator.Probe(rootProcessorRank, MPI.ANY_SOURCE);
        comunicator.Recv(matrixVectorSize, 0, 1, MPI.INT, rootProcessorRank, MPI.ANY_TAG);
        System.out.println("Proc: " + currentProcessorRank + ", receive matrixVectorSize");

        vector = new int[matrixVectorSize[0]];
        comunicator.Probe(rootProcessorRank, MPI.ANY_SOURCE);
        comunicator.Recv(vector, 0, vector.length, MPI.INT, rootProcessorRank, MPI.ANY_TAG);
        System.out.println("Proc: " + currentProcessorRank + ", receive vector");

        comunicator.Probe(rootProcessorRank, MPI.ANY_SOURCE);
        comunicator.Recv(rowsNumber, 0, 1, MPI.INT, rootProcessorRank, MPI.ANY_TAG);
        System.out.println("Proc: " + currentProcessorRank + ", receive rowsNumber");

        subMatrix = new int[rowsNumber[0]][rowsNumber[0]];
        comunicator.Probe(rootProcessorRank, MPI.ANY_SOURCE);
        comunicator.Recv(subMatrix, 0, subMatrix.length, MPI.OBJECT, rootProcessorRank, MPI.ANY_TAG);
        System.out.println("Proc: " + currentProcessorRank + ", receive subMatrix");

        int[] result = new int[rowsNumber[0]];
        multiplyMatrixVector(subMatrix, vector, result);
        comunicator.Send(result, 0, result.length, MPI.INT, rootProcessorRank, MPI.ANY_TAG);
    }

    private static void multiplyMatrixVector(int[][] matrix, int[] vector, int[] result) {
        for (int i = 0; i < matrix.length; i++) {
            int summ = 0;
            for (int j = 0; j < matrix[i].length; j++) {
                summ += matrix[i][j] * vector[j];
            }
            result[i] = summ;
        }
    }

    private static int[][] createAndInitMatrix(int size) {
        int[][] matrix = new int[size][size];
        Random random = new Random();
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix.length; j++) {
                matrix[i][j] = random.nextInt(100);
            }
        }
        return matrix;
    }

    private static int[] createAndInitVector(int size) {
        int[] vector = new int[size];
        Random random = new Random();
        for (int i = 0; i < vector.length; i++) {
            vector[i] = random.nextInt(100);
        }
        return vector;
    }
}
Here is the exception:
MPJ Express (0.44) is started in the multicore configuration
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at runtime.starter.MulticoreStarter.run(MulticoreStarter.java:281)
    at java.lang.Thread.run(Thread.java:745)
Caused by: mpi.MPIException: xdev.XDevException: java.lang.NullPointerException
    at mpi.Comm.isend(Comm.java:944)
    at mpi.Comm.Isend(Comm.java:885)
    at Main.rootProcessorAction(Main.java:35)
    at Main.main(Main.java:20)
    ... 6 more
Caused by: xdev.XDevException: java.lang.NullPointerException
    at xdev.smpdev.SMPDevice.isend(SMPDevice.java:104)
    at mpjdev.javampjdev.Comm.isend(Comm.java:1019)
    at mpi.Comm.isend(Comm.java:941)
    ... 9 more
Caused by: java.lang.NullPointerException
    at xdev.smpdev.SMPDeviceImpl$SendQueue.add(SMPDeviceImpl.java:930)
    at xdev.smpdev.SMPDeviceImpl$SendQueue.add(SMPDeviceImpl.java:909)
    at xdev.smpdev.SMPDeviceImpl.isend(SMPDeviceImpl.java:330)
    at xdev.smpdev.SMPDevice.isend(SMPDevice.java:101)
    ... 11 more
xdev.XDevException: java.lang.NullPointerException
    at xdev.smpdev.SMPDevice.recv(SMPDevice.java:162)
In my experience with MPJ Express, try to avoid the constants MPI.ANY_SOURCE and MPI.ANY_TAG, especially as the tag argument of a send. Set your own tags and sources and you should be fine. When I used these constants in my programs, they would sometimes crash at random with an xdev.XDevException caused by a NullPointerException, and sometimes they ran just fine.
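As a minimal sketch of what that looks like (the class name TagExample and the tag values 100 and 101 are arbitrary choices of mine, not anything required by MPJ Express), every send below uses a fixed application-chosen tag and every receive names the concrete source rank and the matching tag instead of the wildcards; run it with at least two processes:

import mpi.Comm;
import mpi.MPI;

public class TagExample {

    // Arbitrary application-chosen tags (hypothetical values); pick anything that
    // does not collide with the internal constants listed further below.
    private static final int SIZE_TAG = 100;
    private static final int DATA_TAG = 101;

    public static void main(String[] args) {
        MPI.Init(args);
        Comm comm = MPI.COMM_WORLD;
        int rank = comm.Rank();

        if (rank == 0) {
            int[] size = new int[] {5};
            int[] data = new int[] {1, 2, 3, 4, 5};
            // Explicit tags instead of MPI.ANY_TAG on the sending side.
            comm.Send(size, 0, 1, MPI.INT, 1, SIZE_TAG);
            comm.Send(data, 0, data.length, MPI.INT, 1, DATA_TAG);
        } else if (rank == 1) {
            int[] size = new int[1];
            // Concrete source rank and the matching tag on the receiving side.
            comm.Recv(size, 0, 1, MPI.INT, 0, SIZE_TAG);
            int[] data = new int[size[0]];
            comm.Recv(data, 0, data.length, MPI.INT, 0, DATA_TAG);
            System.out.println("Proc 1 received " + data.length + " ints");
        }

        MPI.Finalize();
    }
}

Applied to your program, that means giving each kind of message (size, vector, row count, sub-matrix, result) its own fixed tag in Isend/Send and receiving with that tag and the root's rank rather than MPI.ANY_TAG / MPI.ANY_SOURCE.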
Below is a list of MPJ Express's internal constants, which you should also avoid using as tags; only the integer constants are shown:
public static final int mpi.MPI.NUM_OF_PROCESSORS = 4
public static int mpi.MPI.UNDEFINED = -1
public static int mpi.MPI.THREAD_SINGLE = 1
public static int mpi.MPI.THREAD_FUNNELED = 2
public static int mpi.MPI.THREAD_SERIALIZED = 3
public static int mpi.MPI.THREAD_MULTIPLE = 4
public static int mpi.MPI.ANY_SOURCE = -2
public static int mpi.MPI.ANY_TAG = -2
public static int mpi.MPI.PROC_NULL = -3
public static int mpi.MPI.BSEND_OVERHEAD = 0
public static int mpi.MPI.SEND_OVERHEAD = 0
public static int mpi.MPI.RECV_OVERHEAD = 0
public static final int mpi.MPI.IDENT = 0
public static final int mpi.MPI.CONGRUENT = 3
public static final int mpi.MPI.SIMILAR = 1
public static final int mpi.MPI.UNEQUAL = 2
public static int mpi.MPI.GRAPH = 1
public static int mpi.MPI.CART = 2
public static int mpi.MPI.TAG_UB = 0
public static int mpi.MPI.HOST = 0
public static int mpi.MPI.IO = 0
Cheers.