Custom reduction on GPU vs CPU yields a different result
Why do I see a different result on the GPU than with a sequential reduction on the CPU?
import numpy
from numba import cuda
from functools import reduce
A = (numpy.arange(100, dtype=numpy.float64)) + 1
cuda.reduce(lambda a, b: a + b * 20)(A)
# result 12952749821.0
reduce(lambda a, b: a + b * 20, A)
# result 100981.0
import numba
numba.__version__
# '0.34.0+5.g1762237'
Similar behavior occurs when the reduction is parallelized on the CPU with the Java Stream API:
int n = 10;
float inputArray[] = new float[n];
ArrayList<Float> inputList = new ArrayList<Float>();
for (int i=0; i<n; i++)
{
inputArray[i] = i+1;
inputList.add(inputArray[i]);
}
Optional<Float> resultStream = inputList.stream().parallel().reduce((x, y) -> x+y*20);
// Sequential left-to-right reduction over the same input
float sequentialResult = inputArray[0];
for (int i = 1; i < inputArray.length; i++)
{
sequentialResult = sequentialResult + inputArray[i] * 20;
}
System.out.println("Sequential Result "+sequentialResult);
// Sequential Result 10541.0
System.out.println("Stream Result "+resultStream.get());
// Stream Result 1.2466232E8
It seems that, as Numba's team pointed out, lambda a, b: a + b * 20 is not an associative and commutative reduction function, and that is what produces this unexpected result.
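To make the point concrete, here is a minimal sketch in plain Python. It does not reproduce the exact grouping that cuda.reduce or a parallel Java stream uses; tree_reduce is a hypothetical helper that only imitates a pairwise (tree-shaped) combination order, and a plain list stands in for the NumPy array from the question. It shows that the function is not associative, that a different grouping changes the result, and that rewriting the computation around plain addition removes the dependence on order.

```python
from functools import reduce


def f(a, b):
    # The reduction function from the question.
    return a + b * 20


def tree_reduce(op, values):
    # Hypothetical helper: combine neighbours pairwise, level by level,
    # roughly the way a parallel reduction may group its operands.
    values = list(values)
    while len(values) > 1:
        paired = [op(values[i], values[i + 1])
                  for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # odd element carries over unchanged
        values = paired
    return values[0]


# Same values as numpy.arange(100, dtype=numpy.float64) + 1
A = [float(i) for i in range(1, 101)]

# Associativity already fails for three elements.
left = f(f(1.0, 2.0), 3.0)    # (1 + 40) + 60
right = f(1.0, f(2.0, 3.0))   # 1 + 20 * (2 + 60)

sequential = reduce(f, A)     # strict left-to-right order
pairwise = tree_reduce(f, A)  # different grouping, different value

# Order-independent formulation: reduce with plain addition, scale outside.
rewritten = A[0] + 20 * reduce(lambda a, b: a + b, A[1:])

print(left, right)
print(sequential, pairwise, rewritten)
```

Because addition is associative and commutative, the rewritten form gives the same value as the sequential loop regardless of how the partial sums are grouped, so it is the kind of function a parallel reduction can safely be given.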