Global translucency sorting with instanced rendering

My rendering code is structured so that there are models and there are model instances. Each model can have N instances, and all visible instances of the same model are rendered at once using instanced rendering.
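Roughly, the structure looks like this (a simplified sketch only; the function names are made up for illustration and are not my actual code):

```javascript
// Group visible instances by model so each model can be drawn with a single
// instanced draw call (e.g. via ANGLE_instanced_arrays in WebGL1).
function groupInstancesByModel(instances) {
  const byModel = new Map();
  for (const inst of instances) {
    if (!inst.visible) continue;
    const list = byModel.get(inst.modelId) || [];
    list.push(inst);
    byModel.set(inst.modelId, list);
  }
  return byModel;
}

// One instanced draw per model; the cross-model order is just Map insertion
// order, i.e. effectively the order the models were loaded.
function renderAll(instances, drawModelInstanced) {
  for (const [modelId, list] of groupInstancesByModel(instances)) {
    drawModelInstanced(modelId, list);
  }
}
```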

Performance-wise this works great: my code needs to render hundreds to thousands of instances, each of which can consist of multiple render calls, so the number of render/uniform/texture/etc. calls is an issue.

The problem comes when I want to account for instances that use translucency, which in this case is a lot of them. For those, the order they are rendered in matters, because the models use various blend functions. I can sort the instances of each model relative to one another, but once instances of more than one model are rendered, the order is arbitrary (in practice it is based on the order the models were loaded).

I can't for the life of me think of a way to do this kind of global sorting while using instanced rendering.

Is this possible at all? Or should instanced rendering be used purely for opaque objects?

My code uses WebGL1, which lacks a lot of modern features, but I'd be curious to know whether this is feasible, even if only in a more modern API.

Instancing is, at its core, a performance optimization; there is nothing you can render with instancing that you can't render without it. It's only a question of how many draw calls you make.

If the order in which you render different meshes matters, and for blending it does, then you can't use instancing. If you have to draw everything back to front for your rendering to work, then that's what you have to do, no matter how many draw calls it takes.
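In other words, the translucent pass degenerates to something like this (a sketch; `viewSpaceZ` and `drawOne` stand in for whatever computes an instance's view-space depth and issues a single non-instanced draw):

```javascript
// Sort every translucent instance, across all models, by view-space depth.
// In a typical right-handed view space the camera looks down -z, so the
// most negative z is farthest away and should be drawn first.
function sortBackToFront(instances, viewSpaceZ) {
  return instances
    .map((inst) => ({ inst, z: viewSpaceZ(inst) }))
    .sort((a, b) => a.z - b.z)      // most negative (farthest) first
    .map((entry) => entry.inst);
}

// One draw call per translucent object, farthest first.
function drawTranslucent(instances, viewSpaceZ, drawOne) {
  for (const inst of sortBackToFront(instances, viewSpaceZ)) {
    drawOne(inst);
  }
}
```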

If your question is whether you can sort objects drawn with gl.drawArraysInstanced / gl.drawElementsInstanced together with other things, the answer is no.

If your question is whether there are other ways to optimize, the answer is yes. It really depends on where your bottleneck is.

For example, you can "pull vertices", which basically means you put your vertex data into a texture. Once you've done that, you have random access to the vertices, so you can draw models in any order. You'll have to update at least one buffer or texture with model IDs and/or model vertex offsets, but that may still be faster than drawing each model with a separate draw call.

This talk doesn't use vertex pulling, but it does show that updating a buffer for lots of objects is much faster than making a separate draw call per object. Whether a similar technique fits your use case is up to you.

Here's an example. It puts the data for 4 models (a cube, a sphere, a cylinder, and a torus) into one texture (vertexDataTexture). It then puts the data for each object to be drawn into a separate texture (perObjectDataTexture). Think of those as uniforms; in this case each object has a model matrix and a color.

perObjectDataTexture is updated once per frame with all the uniform data.

There is only 1 attribute, called perVertexData. For each vertex it holds a vertexId (which vertex to use, for fetching vertex data from vertexDataTexture) and an objectId, for fetching that object's data from perObjectDataTexture.

If you want to change the sort order, you have to fill that attribute's buffer every frame.

The result is 2000 independently drawn objects from 4 different models in 1 draw call. Effectively we've made our own, more flexible version of instancing. Pulling data out of textures like this is probably slower than not doing so, but 1 draw call and 2 data uploads are likely faster than 2000 draw calls plus all the extra uniform calls (though I haven't tested it, so it might be slower).

const m4 = twgl.m4;
const v3 = twgl.v3;
const gl = document.querySelector('canvas').getContext('webgl');
const ext = gl.getExtension('OES_texture_float');
if (!ext) {
  alert('need OES_texture_float');
}

const COMMON_STUFF = `
#define TEXTURE_WIDTH 5.0
#define MATRIX_ROW_0_OFFSET ((0. + 0.5) / TEXTURE_WIDTH)
#define MATRIX_ROW_1_OFFSET ((1. + 0.5) / TEXTURE_WIDTH)
#define MATRIX_ROW_2_OFFSET ((2. + 0.5) / TEXTURE_WIDTH)
#define MATRIX_ROW_3_OFFSET ((3. + 0.5) / TEXTURE_WIDTH)
#define COLOR_OFFSET        ((4. + 0.5) / TEXTURE_WIDTH)
`;

const vs = `
attribute vec2 perVertexData;

uniform float perObjectDataTextureHeight;  // NOTE: in WebGL2 use textureSize()
uniform sampler2D perObjectDataTexture;

uniform vec2 vertexDataTextureSize;  // NOTE: in WebGL2 use textureSize()
uniform sampler2D vertexDataTexture;

uniform mat4 projection;
uniform mat4 view;

varying vec3 v_normal;
varying float v_objectId;

${COMMON_STUFF}

void main() {
  float vertexId = perVertexData.x;
  float objectId = perVertexData.y;

  v_objectId = objectId;  // pass to fragment shader

  float objectOffset = (objectId + 0.5) / perObjectDataTextureHeight;

  // note: in WebGL2 better to use texelFetch
  mat4 model = mat4(
    texture2D(perObjectDataTexture, vec2(MATRIX_ROW_0_OFFSET, objectOffset)),
    texture2D(perObjectDataTexture, vec2(MATRIX_ROW_1_OFFSET, objectOffset)),
    texture2D(perObjectDataTexture, vec2(MATRIX_ROW_2_OFFSET, objectOffset)),
    texture2D(perObjectDataTexture, vec2(MATRIX_ROW_3_OFFSET, objectOffset)));
    
  
  // note: in WebGL2 better to use texelFetch
  // note: vertexId will be even numbers since there are 2 pieces of data
  //       per vertex, position and normal.
  vec2 colRow = vec2(mod(vertexId, vertexDataTextureSize.x),
                     floor(vertexId / vertexDataTextureSize.x)) + 0.5;
  vec2 baseUV = colRow / vertexDataTextureSize;
  vec4 position = texture2D(vertexDataTexture, baseUV);
  vec3 normal = texture2D(vertexDataTexture, baseUV + vec2(1) / vertexDataTextureSize).xyz;
  
  gl_Position = projection * view * model * position;
  v_normal = mat3(view) * mat3(model) * normal;
}
`;

const fs = `
precision highp float;

varying vec3 v_normal;
varying float v_objectId;

uniform float perObjectDataTextureHeight;
uniform sampler2D perObjectDataTexture;
uniform vec3 lightDirection;

${COMMON_STUFF}

void main() {
  float objectOffset = (v_objectId + 0.5) / perObjectDataTextureHeight;

  // maybe we should look this up in the vertex shader
  vec4 color = texture2D(perObjectDataTexture, vec2(COLOR_OFFSET, objectOffset));
  
  float l = dot(lightDirection, normalize(v_normal)) * .5 + .5;
  
  gl_FragColor = vec4(color.rgb * l, color.a);
}
`;

// compile shader, link, look up locations
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);

// make some vertex data
const modelVerts = [
  twgl.primitives.createSphereVertices(1, 6, 4),
  twgl.primitives.createCubeVertices(1, 1, 1),
  twgl.primitives.createCylinderVertices(1, 1, 10, 1),
  twgl.primitives.createTorusVertices(1, .2, 16, 8),
].map(twgl.primitives.deindexVertices);
const modelVertexCounts = [];
const modelVertexOffsets = [];
{
  let offset = 0;
  modelVerts.forEach((verts) => {
    let vertexCount = verts.position.length / 3;
    modelVertexCounts.push(vertexCount);
    modelVertexOffsets.push(offset);
    offset += vertexCount;  
  });
}
// merge all the vertices into one
const arrays = twgl.primitives.concatVertices(modelVerts);

// copy arrays into texture.
function copyPositionsAndNormalsIntoTexture(arrays) {
  const maxTextureSize = gl.getParameter(gl.MAX_TEXTURE_SIZE);
  const numVerts = arrays.position.length / 3;
  const numPixels = numVerts * 2;  // each vertex will have position and normal
  const numPixelsNeeded = ((numPixels + maxTextureSize - 1) / maxTextureSize | 0) * maxTextureSize;
  const data = new Float32Array(numPixelsNeeded * 4); // RGBA
  for (let i = 0; i < numVerts; ++i) {
    const src = i * 3;
    const dst = i * 2 * 4;
    data[dst    ] = arrays.position[src    ];
    data[dst + 1] = arrays.position[src + 1];
    data[dst + 2] = arrays.position[src + 2];
    data[dst + 3] = 1;
    data[dst + 4] = arrays.normal[src    ];
    data[dst + 5] = arrays.normal[src + 1];
    data[dst + 6] = arrays.normal[src + 2];
    data[dst + 7] = 1;
  }
  const height = numPixelsNeeded / maxTextureSize;
  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, maxTextureSize, height, 0, gl.RGBA, gl.FLOAT, data);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  return {
    texture,
    size: [maxTextureSize, height],
  };
}

const vertexDataTextureInfo = copyPositionsAndNormalsIntoTexture(arrays);

let numTotalVerts = 0;
const numObjects = 2000;
const objects = [];
for (let i = 0; i < numObjects; ++i) {
  const modelId = r() * modelVerts.length | 0; 
  numTotalVerts += modelVertexCounts[modelId];
  objects.push({
    modelId,
    objectId: i,
  });
}

// for every vertex we need 2 pieces of data
// 1. An objectId (used to look up per object data)
// 2. A vertexId (used to look up the vertex)
const perVertexData = new Uint16Array(numTotalVerts * 2);
// calls gl.createBuffer, gl.bindBuffer, gl.bufferData
const bufferInfo = twgl.createBufferInfoFromArrays(gl, {
  perVertexData: {
    numComponents: 2,
    data: perVertexData,
  },
});

const perObjectDataTexture = gl.createTexture();
const perObjectDataTextureWidth = 5; // 4x4 matrix, 4x1 color
gl.bindTexture(gl.TEXTURE_2D, perObjectDataTexture);
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, perObjectDataTextureWidth, numObjects, 0, gl.RGBA, gl.FLOAT, null);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

// this data is for the texture, one row per model
// first 4 pixels are the model matrix, 5 pixel is the color
const perObjectData = new Float32Array(perObjectDataTextureWidth * numObjects * 4);
const stride = perObjectDataTextureWidth * 4;
const modelOffset = 0;
const colorOffset = 16;

// set the colors at init time
for (let objectId = 0; objectId < numObjects; ++objectId) {
  perObjectData.set([r(), r(), r(), 1], objectId * stride + colorOffset);
}

function r() {
  return Math.random();
}

const RANDOM_RANGE = Math.pow(2, 32);
let seed = 0;
function pseudoRandom() {
  return (seed =
          (134775813 * seed + 1) %
          RANDOM_RANGE) / RANDOM_RANGE;
}

function resetPseudoRandom() {
  seed = 0;
}


function render(time) {
  time *= 0.001;  // seconds
  
  twgl.resizeCanvasToDisplaySize(gl.canvas);
  
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
  gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);
  gl.enable(gl.DEPTH_TEST);
  gl.enable(gl.CULL_FACE);

  const fov = Math.PI * 0.25;
  const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
  const near = 0.1;
  const far = 20;
  const projection = m4.perspective(fov, aspect, near, far);
  
  const eye = [0, 0, 15];
  const target = [0, 0, 0];
  const up = [0, 1, 0];
  const camera = m4.lookAt(eye, target, up);
  const view = m4.inverse(camera);

  // set the matrix for each object in the texture data
  resetPseudoRandom();
  const mat = m4.identity();
  for (let objectId = 0; objectId < numObjects; ++objectId) {
    // of course you'd probably store translation, rotation, etc per object in objects[]
    const t = time * (0.3 + pseudoRandom() * 0.1) + pseudoRandom() * Math.PI * 2;
    
    m4.identity(mat);
    m4.rotateX(mat, t * 0.93, mat);
    m4.rotateY(mat, t * 0.87, mat);
    m4.translate(mat, [
      1 + pseudoRandom() * 3,
      1 + pseudoRandom() * 3,
      1 + pseudoRandom() * 3,
    ], mat);
    m4.rotateZ(mat, t * 1.17, mat);
    
    perObjectData.set(mat, objectId * stride);
  }
  
  // set the per vertex data. (sort objects before this line)
  {
    let offset = 0;
    for (const obj of objects) {
      const numVerts = modelVertexCounts[obj.modelId];
      const vertOffset = modelVertexOffsets[obj.modelId];
      for (let v = 0; v < numVerts; ++v) {
        perVertexData[offset++] = (vertOffset + v) * 2;  // 2 is because 2 pixels per vertex, one for position, one for normal
        perVertexData[offset++] = obj.objectId; 
      }
    }
  }
  // upload the per vertex data
  gl.bindBuffer(gl.ARRAY_BUFFER, bufferInfo.attribs.perVertexData.buffer);
  gl.bufferSubData(gl.ARRAY_BUFFER, 0, perVertexData);
  
  // upload the texture data
  gl.bindTexture(gl.TEXTURE_2D, perObjectDataTexture);
  gl.texSubImage2D(gl.TEXTURE_2D, 0, 0, 0, perObjectDataTextureWidth, numObjects, 
                   gl.RGBA, gl.FLOAT, perObjectData);
  
  gl.useProgram(programInfo.program);
  
  // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
  twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
  
  // calls gl.activeTexture, gl.bindTexture, gl.uniformXXX
  twgl.setUniforms(programInfo, {
    lightDirection: v3.normalize([1, 2, 3]),
    perObjectDataTexture,
    perObjectDataTextureHeight: numObjects,
    vertexDataTexture: vertexDataTextureInfo.texture,
    vertexDataTextureSize: vertexDataTextureInfo.size,
    projection,
    view,
  });  
  
  // calls gl.drawArrays or gl.drawElements
  twgl.drawBufferInfo(gl, bufferInfo);

  requestAnimationFrame(render);
}
requestAnimationFrame(render);
body { margin: 0; }
canvas { width: 100vw; height: 100vh; display: block; }
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
<canvas></canvas>

A few things to note:

  • I was lazy and made perObjectDataTexture use one row per object, which means you can have at most gl.getParameter(gl.MAX_TEXTURE_SIZE) objects. To fix that you'd need to change how the per-object data is stored in the texture, and then fix the shader's uv math to match how the data is arranged.

  • I look the color up in the fragment shader instead of passing it in from the vertex shader. Varyings are limited in number; I believe 8 is generally the minimum available. Arguably it would be better to use them rather than just pass the objectId and do all the lookups in the fragment shader.
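For the first note, a denser layout would pack the 5 pixels of per-object data linearly and wrap them across a fixed-width texture. The index math the shader would then need can be sketched like this (`objectPixelToUV` is a hypothetical helper, mirroring the +0.5 texel-center math already used in the shaders above):

```javascript
// The example stores 5 pixels per object (4 matrix rows + 1 color pixel).
const PIXELS_PER_OBJECT = 5;

// Map (objectId, pixelNdx) to the uv center of the texel holding that pixel,
// for a layout where per-object data is packed linearly and wraps at texWidth.
function objectPixelToUV(objectId, pixelNdx, texWidth, texHeight) {
  const linear = objectId * PIXELS_PER_OBJECT + pixelNdx;
  const col = linear % texWidth;
  const row = Math.floor(linear / texWidth);
  return [(col + 0.5) / texWidth, (row + 0.5) / texHeight];
}
```

With a layout like this the object count is limited by texWidth * texHeight / PIXELS_PER_OBJECT rather than by the number of texture rows.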