Deinterlacing with no interpolation kernel
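I am working on an OpenCL kernel to solve a specific real-time deinterlacing problem. I have frames (RGB, 720*480*3) that are each composed of 4 interlaced fields, so I am trying to deinterlace them into the original fields G, of dimensions (width/4)*(height/4), using the following equation:
G_i = 1/4 ( f(x', y') + f(x' + 1, y') + f(x' + 2, y') + f(x' + 3, y') )
where i = 0, 1, 2, 3
and (x', y') = (4x, 4y + i)
Each resulting field G_i will therefore contain 1/16 of the frame's pixels; ultimately I intend to use the deinterlaced fields individually.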
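For reference, the equation above can be written out as a slow, pure-Python sketch on a flattened frame. This is not the OpenCL kernel from the post; the function name `deinterlace_field` and its parameters are hypothetical, the frame is assumed to be row-major with interleaved channels, and the average is truncated with integer division (an assumption, since the post does not say how to round).

```python
def deinterlace_field(frame, width, height, channels, i, n=4):
    """Return field G_i as a flat list (row-major, interleaved channels).

    frame: flat sequence of ints of length height * width * channels.
    i:     which field to extract, 0 <= i < n.
    """
    out_w, out_h = width // n, height // n
    field = [0] * (out_w * out_h * channels)
    for y in range(out_h):
        src_row = n * y + i              # y' = 4y + i
        for x in range(out_w):
            src_col = n * x              # x' = 4x
            for c in range(channels):
                total = 0
                for k in range(n):       # f(x' + k, y') for k = 0..3
                    total += frame[(src_row * width + src_col + k) * channels + c]
                # Truncating average (assumption: no rounding specified)
                field[(y * out_w + x) * channels + c] = total // n
    return field


# Tiny single-channel 8x8 frame in which every pixel holds its row number.
demo = [r for r in range(8) for _ in range(8)]
print(deinterlace_field(demo, 8, 8, 1, i=1))  # [1, 1, 5, 5]
```

Field 1 picks up source rows 1 and 5, which is why the 2x2 result holds those two values. Something like this could serve as a ground truth to compare a kernel's output against on small frames.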
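Here is what I have so far. I have been struggling with it for quite a while and haven't quite got it working. Can anyone help? I think I need a stride of 4 to step through the flattened frame?
Calling the OpenCL program (in PyOpenCL):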
# Call limiting the global work size to 1/16th of the frame,
# outputting to an array of 1/16th the size
self.program.deinterlace(self.queue,
                         (self.dim[0] // self.n, self.dim[1] // self.n),
                         None,
                         self.frame_buf, self.dest_buf,
                         np.int32(self.dim[1]),
                         np.int32(self.dim[2]))
result = np.empty((self.dim[0] // self.n, self.dim[1] // self.n, 3),
                  dtype=np.uint8)
cl.enqueue_copy(self.queue, result, self.dest_buf).wait()
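OpenCL kernel:
__kernel void deinterlace(
    __global const uchar *a,   // input frame, flattened row-major
    __global uchar *c,         // output field, flattened row-major
    const int width,           // row width in pixels, as passed by the host
    const int channels         // channels per pixel (3 for RGB)
)
{
    int rowid = get_global_id(0);
    int colid = get_global_id(1);

    int ncols = width;
    int nchan = channels;

    // Source offset: rowid * 16 * ncols pixels plus colid * 4 pixels
    int index = rowid * 4 * ncols * 4 * nchan + colid * 4 * nchan;
    // Destination offset: rowid * ncols pixels plus colid pixels
    int newindex = rowid * ncols * nchan + colid * nchan;

    // Copy the three colour channels of the selected pixel
    c[newindex + 0] = a[index + 0];
    c[newindex + 1] = a[index + 1];
    c[newindex + 2] = a[index + 2];
}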
[Image: input frame]
[Image: result]
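Right, so I had simply forgotten to scale down the iteration range: the width passed to the kernel has to be the reduced width (the frame width divided by self.n), not the full frame width.
The call to the OpenCL program should be: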
# Call limiting the global work size to 1/16th of the frame,
# outputting to an array of 1/16th the size
self.program.deinterlace(self.queue,
                         (self.dim[0] // self.n, self.dim[1] // self.n),
                         None,
                         self.frame_buf, self.dest_buf,
                         np.int32(self.dim[1] // self.n),
                         np.int32(self.dim[2]))
result = np.empty((self.dim[0] // self.n, self.dim[1] // self.n, 3),
                  dtype=np.uint8)
cl.enqueue_copy(self.queue, result, self.dest_buf).wait()
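To see why the width argument matters, the kernel's index arithmetic can be replayed on the host for a single work-item. The sketch below is plain Python rather than OpenCL, and the helpers `kernel_indices` and `to_pixel` are hypothetical names introduced only for illustration; it assumes a 720-pixel-wide RGB frame and n = 4, as in the question.

```python
def kernel_indices(rowid, colid, width_arg, nchan=3, n=4):
    """Mirror the kernel's index arithmetic for one work-item."""
    index = rowid * n * width_arg * n * nchan + colid * n * nchan
    newindex = rowid * width_arg * nchan + colid * nchan
    return index, newindex


def to_pixel(flat_index, row_width, nchan=3):
    """Convert a flat byte offset back to (row, column) for a given row width."""
    pixel = flat_index // nchan
    return pixel // row_width, pixel % row_width


FRAME_W, N = 720, 4
OUT_W = FRAME_W // N  # 180

# Corrected call: the width argument is the output width (180).
src, dst = kernel_indices(2, 5, OUT_W)
print(to_pixel(src, FRAME_W), to_pixel(dst, OUT_W))          # (8, 20) (2, 5)

# Original call: the width argument is the full frame width (720).
src_bad, dst_bad = kernel_indices(2, 5, FRAME_W)
print(to_pixel(src_bad, FRAME_W), to_pixel(dst_bad, OUT_W))  # (32, 20) (8, 5)
```

With the reduced width, work-item (2, 5) reads frame pixel (8, 20) and writes output pixel (2, 5), i.e. every 4th row and column. With the full width it reads from row 32 and writes to output row 8, so rows are skipped 16 at a time and both offsets run past the ends of their buffers for larger row ids.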