Optimizing data types to increase speed

Question

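To write efficient code, you should use the simplest data types possible. This is even more true for RenderScript, where the same computation is repeated many times in the kernel. Now, I want to write a very simple kernel that takes a (color) bitmap as input and produces an int[] array as output: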

#pragma version(1)
#pragma rs java_package_name(com.example.xxx)
#pragma rs_fp_relaxed

uint __attribute__((kernel)) grauInt(uchar4 in) {
    uint gr = (uint) (0.21*in.r + 0.72*in.g + 0.07*in.b);
    return gr;
}

Java side:

int[] data1 = new int[width*height];
ScriptC_gray graysc;
graysc=new ScriptC_gray(rs);
Type.Builder TypeOut = new Type.Builder(rs, Element.U32(rs));
TypeOut.setX(width).setY(height);
Allocation outAlloc = Allocation.createTyped(rs, TypeOut.create());

Allocation inAlloc = Allocation.createFromBitmap(rs, bmpfoto1, Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
graysc.forEach_grauInt(inAlloc,outAlloc);
outAlloc.copyTo(data1);

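For a 600,000-pixel bitmap, this takes 40 ms on my Samsung S5 (5.0) and 180 ms on my Samsung Tab2 (4.2). Now I tried to optimize. Since the output is actually an 8-bit unsigned integer (0-255), I tried the following: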

uchar __attribute__((kernel)) grauInt(uchar4 in) {
    uchar gr = 0.2125*in.r + 0.7154*in.g + 0.0721*in.b;
    return gr;
}

and changed line 4 of the Java code to:

Type.Builder TypeOut = new Type.Builder(rs, Element.U8(rs));

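However, this produces the error "32 bit integer source does not match allocation type UNSIGNED_8". My explanation is that the forEach_grauInt(inAlloc,outAlloc) statement expects the same Element type on the input and output side. So I tried to "disconnect" the input and output allocations and treat the input allocation (the bitmap) as a global variable bmpAllocIn, like this: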

#pragma version(1)
#pragma rs java_package_name(com.example.dani.oldgauss)
#pragma rs_fp_relaxed

rs_allocation bmpAllocIn;
int32_t width;
int32_t height;

uchar __attribute__((kernel)) grauInt(uint32_t x, uint32_t y) {
    uchar4 c = rsGetElementAt_uchar4(bmpAllocIn, x, y);
    uchar gr = (uchar) 0.2125*c.r + 0.7154*c.g + 0.0721*c.b;
    return gr;
}

And on the Java side:

int[] data1 = new int[width*height];
ScriptC_gray graysc;
graysc=new ScriptC_gray(rs);

graysc.set_bmpAllocIn(Allocation.createFromBitmap(rs,bmpfoto1));
Type.Builder TypeOut = new Type.Builder(rs, Element.U8(rs));
TypeOut.setX(width).setY(height);
Allocation outAlloc = Allocation.createTyped(rs, TypeOut.create());

graysc.forEach_grauInt(outAlloc);
outAlloc.copyTo(data1);

Surprisingly, I get the same error message again: "32 bit integer source does not match allocation type UNSIGNED_8". This I cannot understand. What am I doing wrong here?

Answer 1

The reason is the

int[] data1 = new int[width * height];

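line. You are using the int[] array it creates as the destination of copyTo() on a U8 allocation, and that is what throws the exception. Change it to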

byte[] data1 = new byte[width * height];

and everything will be fine. By the way, input and output allocations can be of different types.
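If the gray values are still wanted as an int[] afterwards (an assumption based on the original code), keep in mind that Java bytes are signed, so a gray level of 200 reads back as -56. A minimal sketch of the conversion follows; GrayBytes and toUnsignedInts are hypothetical names, not part of the RenderScript API:

```java
final class GrayBytes {
    // Hypothetical helper: widens the bytes filled by outAlloc.copyTo(byte[])
    // into the 0-255 range the question expected in its int[] array.
    static int[] toUnsignedInts(byte[] gray) {
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            // Masking with 0xFF removes the sign extension, so (byte) -1 becomes 255.
            out[i] = gray[i] & 0xFF;
        }
        return out;
    }
}
```

This extra pass costs time on the Java side, so it is only worth doing if later code really needs int values; otherwise apply the & 0xFF mask at the point where each byte is read.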

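As a side note, you can also eliminate floating-point computation from the RS filter entirely, which will improve performance on some architectures.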
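One common way to do this is fixed-point arithmetic: scale the weights by 256, sum with integers, and shift right by 8. The sketch below shows the arithmetic in plain Java so it can be checked off-device; the same expression should carry over to the kernel body, though that is an assumption and has not been measured here. FixedPointGray and luma are hypothetical names, and the weights are the question's 0.2125, 0.7154, 0.0721 rounded to integers.

```java
final class FixedPointGray {
    // 0.2125, 0.7154, 0.0721 scaled by 256 give about 54.4, 183.1, 18.5.
    // Blue is rounded up to 19 so the weights sum to exactly 256,
    // which keeps pure white at 255.
    static final int WR = 54;
    static final int WG = 183;
    static final int WB = 19;

    // r, g, b are expected in 0-255; the result is also in 0-255.
    static int luma(int r, int g, int b) {
        // The shift by 8 divides by 256 and truncates, so a result can be
        // one level lower than the rounded floating-point value.
        return (WR * r + WG * g + WB * b) >> 8;
    }
}
```

The result can differ by one gray level from the floating-point version because of truncation, which is usually acceptable for a grayscale conversion.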

types

allocation

renderscript