什么是排序算法(以及它在 GPU 上运行的效率如何)
What's that for a sorting algorithm (and how effective it runs on GPU)
我已经实现了一个用于排序像素的着色器:
void main()
{
vec4 renderedImagePixel = texture2D(CalculatedImage,v_texcoord);
if(int(numRenderPass) == int(v_texcoord.y*float(height)) && fbo ){
vec2 coordTHIS = vec2(v_texcoord.x,v_texcoord.y-1.0/float(height));
float THIS = unpack(texture2D(CalculatedImage,coordTHIS));
renderedImagePixel = texture2D(CalculatedImage,coordTHIS);
if(sieveCycle == true){ //even Cycle
//Even cell
if(mod(int(v_texcoord.x*float(width)),2) == 1){
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x+1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THIS > THAT ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}else{
//Odd cell
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x-1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THAT > THIS ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}
}else{ //odd cycle
//Even cell
if(mod(int(v_texcoord.x*float(width)),2) == 0){
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x+1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THIS > THAT ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}else{
//Odd cell
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x-1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THAT > THIS ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}
}
}
gl_FragColor = renderedImagePixel;
}
现在我想知道它的效果如何?
我的想法是,如果在一个循环中计算每个像素,则算法在最坏的情况下应该是 O(n)。这是正确的吗。万一它只是冒泡排序的派生词呢
这是排序示例的图像:
您正在实施 sorting network 并且使用了奇偶排序算法。
在输出完全排序之前需要 O(n) 次迭代。
然而,存在更高效的算法,例如 bitonic sort。这只需要 O(log² n)
次迭代。着色器将变得更加复杂,因为 other
值将根据迭代次数发生变化。
您可以通过使用另一个纹理来简化它,其中将对另一个索引进行编码以及是否取最小值或最大值。
我已经实现了一个用于排序像素的着色器:
void main()
{
vec4 renderedImagePixel = texture2D(CalculatedImage,v_texcoord);
if(int(numRenderPass) == int(v_texcoord.y*float(height)) && fbo ){
vec2 coordTHIS = vec2(v_texcoord.x,v_texcoord.y-1.0/float(height));
float THIS = unpack(texture2D(CalculatedImage,coordTHIS));
renderedImagePixel = texture2D(CalculatedImage,coordTHIS);
if(sieveCycle == true){ //even Cycle
//Even cell
if(mod(int(v_texcoord.x*float(width)),2) == 1){
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x+1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THIS > THAT ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}else{
//Odd cell
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x-1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THAT > THIS ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}
}else{ //odd cycle
//Even cell
if(mod(int(v_texcoord.x*float(width)),2) == 0){
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x+1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THIS > THAT ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}else{
//Odd cell
//CHANGES IN CODE HERE
vec2 coordTHAT = vec2(v_texcoord.x-1.0/float(width),v_texcoord.y-1.0/float(height));
float THAT = unpack(texture2D(CalculatedImage,coordTHAT));
//CHANGES IN CODE HERE
if( THAT > THIS ){
renderedImagePixel = texture2D(CalculatedImage,coordTHAT);
}
}
}
}
gl_FragColor = renderedImagePixel;
}
现在我想知道它的效果如何?
我的想法是,如果在一个循环中计算每个像素,则算法在最坏的情况下应该是 O(n)。这是正确的吗。万一它只是冒泡排序的派生词呢
这是排序示例的图像:
您正在实施 sorting network 并且使用了奇偶排序算法。
在输出完全排序之前需要 O(n) 次迭代。
然而,存在更高效的算法,例如 bitonic sort。这只需要 O(log² n)
次迭代。着色器将变得更加复杂,因为 other
值将根据迭代次数发生变化。
您可以通过使用另一个纹理来简化它,其中将对另一个索引进行编码以及是否取最小值或最大值。