How to get parallel GPU pixel rendering? For voxel ray tracing

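I made a voxel raycaster in Unity using a compute shader and a texture. But at 1080p, it is limited to a view distance of only 100 at 30 fps. With no light bounces or anything yet, I am quite disappointed with this performance.

I tried learning Vulkan, but the best tutorials are based on rasterization, and I guess all I really want to do is compute pixels in parallel on the GPU. I am familiar with CUDA, and I've read that it is sometimes used for rendering? Or is there a simple way of just computing pixels in parallel in Vulkan? I already have a template Vulkan project that opens a blank window. I don't need to get any data back from the GPU, just render straight to the screen after giving it data.

With the code below, would it be significantly faster in Vulkan than in a Unity compute shader? It has a lot of if/else statements, which I have read are bad for GPUs, but I can't think of any other way of writing it.

EDIT: I optimized it as much as I could, but it's still pretty slow, around 30 fps at 1080p.

Here is the compute shader: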

#pragma kernel CSMain

RWTexture2D<float4> Result; // the actual array of pixels the player sees
const float width; // in pixels
const float height;

const StructuredBuffer<int> voxelMaterials; // for now just getting a flat voxel array
const int voxelBufferRowSize;
const int voxelBufferPlaneSize;
const int voxelBufferSize;
const StructuredBuffer<float3> rayDirections; // I'm now actually using it as points instead of directions
const float maxRayDistance;

const float3 playerCameraPosition; // relative to the voxelData, ie the first voxel's bottom, back, left corner position, no negative coordinates
const float3 playerWorldForward;
const float3 playerWorldRight;
const float3 playerWorldUp;

[numthreads(8,8,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    Result[id.xy] = float4(0, 0, 0, 0); // setting the pixel to black by default
    float3 pointHolder = playerCameraPosition; // initializing the first point to the player's position
    const float3 p = rayDirections[id.x + (id.y * width)]; // vector transformation getting the world space directions of the rays relative to the player
    const float3 u1 = p.x * playerWorldRight;
    const float3 u2 = p.y * playerWorldUp;
    const float3 u3 = p.z * playerWorldForward;
    const float3 direction = u1 + u2 + u3; // the direction to that point

    float distanceTraveled = 0;
    int3 directionAxes; // 1 for positive, 0 for zero, -1 for negative
    int3 directionIfReplacements = { 0, 0, 0 }; // 1 if the direction is positive on that axis, otherwise 0
    float3 axesUnit = { 1 / abs(direction.x), 1 / abs(direction.y), 1 / abs(direction.z) };
    float3 distancesXYZ = { 1000, 1000, 1000 };
    int face = 0; // 1 = x, 2 = y, 3 = z // the current face the while loop point is on

    // comparing the floats once in the beginning so the rest of the ray traversal can compare ints
    if (direction.x > 0) {
        directionAxes.x = 1;
        directionIfReplacements.x = 1;
    }
    else if (direction.x < 0) {
        directionAxes.x = -1;
    }
    else {
        distanceTraveled = maxRayDistance; // just ending the ray for now if one of its direction axes is exactly 0. You'll see a line of black pixels if the player's rotation is zero but this never happens naturally
        directionAxes.x = 0;
    }
    if (direction.y > 0) {
        directionAxes.y = 1;
        directionIfReplacements.y = 1;
    }
    else if (direction.y < 0) {
        directionAxes.y = -1;
    }
    else {
        distanceTraveled = maxRayDistance;
        directionAxes.y = 0;
    }
    if (direction.z > 0) {
        directionAxes.z = 1;
        directionIfReplacements.z = 1;
    }
    else if (direction.z < 0) {
        directionAxes.z = -1;
    }
    else {
        distanceTraveled = maxRayDistance;
        directionAxes.z = 0;
    }

    // calculating the first point
    if (playerCameraPosition.x < voxelBufferRowSize &&
        playerCameraPosition.x >= 0 &&
        playerCameraPosition.y < voxelBufferRowSize &&
        playerCameraPosition.y >= 0 &&
        playerCameraPosition.z < voxelBufferRowSize &&
        playerCameraPosition.z >= 0)
    {
        int voxelIndex = floor(playerCameraPosition.x) + (floor(playerCameraPosition.z) * voxelBufferRowSize) + (floor(playerCameraPosition.y) * voxelBufferPlaneSize); // the voxel index in the flat array

        switch (voxelMaterials[voxelIndex]) {
        case 1:
            Result[id.xy] = float4(1, 0, 0, 0);
            distanceTraveled = maxRayDistance; // to end the while loop
            break;
        case 2:
            Result[id.xy] = float4(0, 1, 0, 0);
            distanceTraveled = maxRayDistance;
            break;
        case 3:
            Result[id.xy] = float4(0, 0, 1, 0);
            distanceTraveled = maxRayDistance;
            break;
        default:
            break;
        }
    }

    // traversing the ray beyond the first point
    while (distanceTraveled < maxRayDistance) 
    {
        switch (face) {
        case 1:
            distancesXYZ.x = axesUnit.x;
            distancesXYZ.y = (floor(pointHolder.y + directionIfReplacements.y) - pointHolder.y) / direction.y;
            distancesXYZ.z = (floor(pointHolder.z + directionIfReplacements.z) - pointHolder.z) / direction.z;
            break;
        case 2:
            distancesXYZ.y = axesUnit.y;
            distancesXYZ.x = (floor(pointHolder.x + directionIfReplacements.x) - pointHolder.x) / direction.x;
            distancesXYZ.z = (floor(pointHolder.z + directionIfReplacements.z) - pointHolder.z) / direction.z;
            break;
        case 3:
            distancesXYZ.z = axesUnit.z;
            distancesXYZ.x = (floor(pointHolder.x + directionIfReplacements.x) - pointHolder.x) / direction.x;
            distancesXYZ.y = (floor(pointHolder.y + directionIfReplacements.y) - pointHolder.y) / direction.y;
            break;
        default:
            distancesXYZ.x = (floor(pointHolder.x + directionIfReplacements.x) - pointHolder.x) / direction.x;
            distancesXYZ.y = (floor(pointHolder.y + directionIfReplacements.y) - pointHolder.y) / direction.y;
            distancesXYZ.z = (floor(pointHolder.z + directionIfReplacements.z) - pointHolder.z) / direction.z;
            break;
        }

        face = 0; // 1 = x, 2 = y, 3 = z
        float smallestDistance = 1000;
        if (distancesXYZ.x < smallestDistance) {
            smallestDistance = distancesXYZ.x;
            face = 1;
        }
        if (distancesXYZ.y < smallestDistance) {
            smallestDistance = distancesXYZ.y;
            face = 2;
        }
        if (distancesXYZ.z < smallestDistance) {
            smallestDistance = distancesXYZ.z;
            face = 3;
        }
        if (smallestDistance == 0) {
            break;
        }

        int3 facesIfReplacement = { 1, 1, 1 };
        switch (face) { // directionIfReplacements is positive if positive but I want to subtract so invert it to subtract 1 when negative subtract nothing when positive
        case 1:
            facesIfReplacement.x = 1 - directionIfReplacements.x;
            break;
        case 2:
            facesIfReplacement.y = 1 - directionIfReplacements.y;
            break;
        case 3:
            facesIfReplacement.z = 1 - directionIfReplacements.z;
            break;
        }

        pointHolder += direction * smallestDistance; // the actual ray marching
        distanceTraveled += smallestDistance;

        int3 voxelIndexXYZ = { -1,-1,-1 }; // the integer coordinates within the buffer
        voxelIndexXYZ.x = ceil(pointHolder.x - facesIfReplacement.x);
        voxelIndexXYZ.y = ceil(pointHolder.y - facesIfReplacement.y);
        voxelIndexXYZ.z = ceil(pointHolder.z - facesIfReplacement.z);

        //check if voxelIndexXYZ is within bounds of the voxel buffer before indexing the array
        if (voxelIndexXYZ.x < voxelBufferRowSize &&
            voxelIndexXYZ.x >= 0 &&
            voxelIndexXYZ.y < voxelBufferRowSize &&
            voxelIndexXYZ.y >= 0 &&
            voxelIndexXYZ.z < voxelBufferRowSize &&
            voxelIndexXYZ.z >= 0)
        {
            int voxelIndex = voxelIndexXYZ.x + (voxelIndexXYZ.z * voxelBufferRowSize) + (voxelIndexXYZ.y * voxelBufferPlaneSize); // the voxel index in the flat array
            switch (voxelMaterials[voxelIndex]) {
            case 1:
                Result[id.xy] = float4(1, 0, 0, 0) * (1 - (distanceTraveled / maxRayDistance));
                distanceTraveled = maxRayDistance; // to end the while loop
                break;
            case 2:
                Result[id.xy] = float4(0, 1, 0, 0) * (1 - (distanceTraveled / maxRayDistance));
                distanceTraveled = maxRayDistance;
                break;
            case 3:
                Result[id.xy] = float4(0, 0, 1, 0) * (1 - (distanceTraveled / maxRayDistance));
                distanceTraveled = maxRayDistance;
                break;
            }
        }
        else {
            break; // should be uncommented in actual game implementation where the player will always be inside the voxel buffer
        }
    }
}

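Depending on the voxel data you give it, it produces this:

Here is the shader after "optimizing" it and taking out all the branching or diverging conditional statements (I think):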

#pragma kernel CSMain

RWTexture2D<float4> Result; // the actual array of pixels the player sees
float4 resultHolder;
const float width; // in pixels
const float height;

const Buffer<int> voxelMaterials; // for now just getting a flat voxel array
const Buffer<float4> voxelColors;
const int voxelBufferRowSize;
const int voxelBufferPlaneSize;
const int voxelBufferSize;
const Buffer<float3> rayDirections; // I'm now actually using it as points instead of directions
const float maxRayDistance;

const float3 playerCameraPosition; // relative to the voxelData, ie the first voxel's bottom, back, left corner position, no negative coordinates
const float3 playerWorldForward;
const float3 playerWorldRight;
const float3 playerWorldUp;

[numthreads(16, 16, 1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    resultHolder = float4(0, 0, 0, 0); // setting the pixel to black by default
    float3 pointHolder = playerCameraPosition; // initializing the first point to the player's position
    const float3 p = rayDirections[id.x + (id.y * width)]; // vector transformation getting the world space directions of the rays relative to the player
    const float3 u1 = p.x * playerWorldRight;
    const float3 u2 = p.y * playerWorldUp;
    const float3 u3 = p.z * playerWorldForward;
    const float3 direction = u1 + u2 + u3; // the transformed ray direction in world space
    const bool anyDir0 = direction.x == 0 || direction.y == 0 || direction.z == 0; // preventing a division by zero
    float distanceTraveled = maxRayDistance * anyDir0;

    const float3 nonZeroDirection = { // to prevent a division by zero
        direction.x + (1 * anyDir0),
        direction.y + (1 * anyDir0),
        direction.z + (1 * anyDir0)
    };
    const float3 axesUnits = { // the distances if the axis is an integer
        1.0f / abs(nonZeroDirection.x),
        1.0f / abs(nonZeroDirection.y),
        1.0f / abs(nonZeroDirection.z)
    };
    const bool3 isDirectionPositiveOr0 = {
        direction.x >= 0,
        direction.y >= 0,
        direction.z >= 0
    };

    while (distanceTraveled < maxRayDistance)
    {
        const bool3 pointIsAnInteger = {
            (int)pointHolder.x == pointHolder.x,
            (int)pointHolder.y == pointHolder.y,
            (int)pointHolder.z == pointHolder.z
        };

        const float3 distancesXYZ = {
            ((floor(pointHolder.x + isDirectionPositiveOr0.x) - pointHolder.x) / direction.x * !pointIsAnInteger.x)  +  (axesUnits.x * pointIsAnInteger.x),
            ((floor(pointHolder.y + isDirectionPositiveOr0.y) - pointHolder.y) / direction.y * !pointIsAnInteger.y)  +  (axesUnits.y * pointIsAnInteger.y),
            ((floor(pointHolder.z + isDirectionPositiveOr0.z) - pointHolder.z) / direction.z * !pointIsAnInteger.z)  +  (axesUnits.z * pointIsAnInteger.z)
        };

        float smallestDistance = min(distancesXYZ.x, distancesXYZ.y);
        smallestDistance = min(smallestDistance, distancesXYZ.z);

        pointHolder += direction * smallestDistance;
        distanceTraveled += smallestDistance;

        const int3 voxelIndexXYZ = {
            floor(pointHolder.x) - (!isDirectionPositiveOr0.x && (int)pointHolder.x == pointHolder.x), 
            floor(pointHolder.y) - (!isDirectionPositiveOr0.y && (int)pointHolder.y == pointHolder.y),
            floor(pointHolder.z) - (!isDirectionPositiveOr0.z && (int)pointHolder.z == pointHolder.z)
        };

        const bool inBounds = (voxelIndexXYZ.x < voxelBufferRowSize && voxelIndexXYZ.x >= 0) && (voxelIndexXYZ.y < voxelBufferRowSize && voxelIndexXYZ.y >= 0) && (voxelIndexXYZ.z < voxelBufferRowSize && voxelIndexXYZ.z >= 0);

        const int voxelIndexFlat = (voxelIndexXYZ.x + (voxelIndexXYZ.z * voxelBufferRowSize) + (voxelIndexXYZ.y * voxelBufferPlaneSize)) * inBounds; // meaning the voxel at 0,0,0 must always be empty, since it acts as our out-of-range index guard

        if (voxelMaterials[voxelIndexFlat] > 0) {
            resultHolder = voxelColors[voxelMaterials[voxelIndexFlat]] * (1 - (distanceTraveled / maxRayDistance));
            break;
        }   
        if (!inBounds) break;
    }
    Result[id.xy] = resultHolder;
}

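A compute shader is what it is: a program that runs on the GPU, whether under Vulkan or in Unity, so you are computing in parallel either way. The point of Vulkan, however, is that it gives you more control over the commands being executed on the GPU: synchronization, memory, and so on. So it is not necessarily going to be faster in Vulkan than in Unity. What you should actually do is optimise your shaders.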

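Also, the main problem with if/else is divergence: the threads in a group execute in lock-step, so when they take different branches the group has to run through both paths. So, if you can avoid that, the performance impact will be far lessened. Material on avoiding branch divergence might help you with that.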
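One common way to reduce both the per-step arithmetic and the branching in a voxel ray caster is the incremental grid traversal (DDA) described by Amanatides and Woo. This is a different stepping scheme from the one in the question's shaders, not a rewrite of them. The sketch below is a CPU-side C++ reference so it can be checked in isolation; the loop body should carry over to HLSL or GLSL almost line for line. The names `Vec3`, `Hit` and `traverse` are hypothetical, and the cubic grid with the flat layout `x + z*size + y*size*size` is an assumption taken from the question's index calculation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

struct Hit {
    bool  found;     // true if a non-empty voxel was reached
    int   x, y, z;   // coordinates of that voxel
    int   material;  // value stored in that voxel
    float t;         // ray parameter where the ray entered it (a distance if dir is unit length)
};

// Walks a ray through a size^3 grid one voxel at a time.
// Assumed layout (from the question): index = x + z*size + y*size*size, 0 means empty.
Hit traverse(const std::vector<int>& voxels, int size, Vec3 origin, Vec3 dir, float maxT)
{
    assert(size > 0 && voxels.size() == static_cast<std::size_t>(size) * size * size);
    const float inf = std::numeric_limits<float>::infinity();
    const float o[3] = { origin.x, origin.y, origin.z };
    const float d[3] = { dir.x, dir.y, dir.z };
    int   cell[3];    // voxel the ray is currently in
    int   step[3];    // +1, -1 or 0 per axis
    float tNext[3];   // t at which the ray crosses the next boundary on each axis
    float tDelta[3];  // t needed to cross one whole voxel on each axis

    // All divisions and floor() calls happen once, before the loop.
    for (int i = 0; i < 3; ++i) {
        cell[i] = static_cast<int>(std::floor(o[i]));
        if (d[i] > 0.0f)      { step[i] =  1; tDelta[i] =  1.0f / d[i]; tNext[i] = (cell[i] + 1 - o[i]) / d[i]; }
        else if (d[i] < 0.0f) { step[i] = -1; tDelta[i] = -1.0f / d[i]; tNext[i] = (cell[i] - o[i]) / d[i]; }
        else                  { step[i] =  0; tDelta[i] = inf;          tNext[i] = inf; } // never chosen
    }

    float t = 0.0f;
    while (t <= maxT) {
        if (cell[0] < 0 || cell[0] >= size ||
            cell[1] < 0 || cell[1] >= size ||
            cell[2] < 0 || cell[2] >= size) {
            break; // left the grid
        }
        const int m = voxels[cell[0] + cell[2] * size + cell[1] * size * size];
        if (m > 0) {
            return Hit{ true, cell[0], cell[1], cell[2], m, t };
        }
        // Advance along whichever axis reaches its next boundary first.
        const int a = (tNext[0] < tNext[1]) ? (tNext[0] < tNext[2] ? 0 : 2)
                                            : (tNext[1] < tNext[2] ? 1 : 2);
        t = tNext[a];
        tNext[a] += tDelta[a];
        cell[a]  += step[a];
    }
    return Hit{ false, 0, 0, 0, 0, 0.0f };
}
```

Each iteration here is a comparison, one addition and an integer step, with no per-step `floor`, `ceil` or division. An axis whose direction component is exactly zero simply gets an infinite `tNext`, so it does not need to end the ray. Whether this actually beats the shaders above on a given GPU would need to be measured.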


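If you still want to do all that in Vulkan...

Since you are not going to do any triangle rasterisation, you probably won't need the render passes or graphics pipelines that the tutorials generally show. Instead you will need a compute pipeline. Those are much simpler than graphics pipelines, requiring only one shader and the pipeline layout (the inputs and outputs are bound via descriptor sets).

You just need to pass the swapchain image to the compute shader as a storage image in a descriptor (and, of course, any other data your shader may need is passed via descriptors as well). For that you need to specify VK_IMAGE_USAGE_STORAGE_BIT in the imageUsage field of your swapchain creation structure.

Then, in your command buffer, you bind the descriptor sets with the image and other data, bind the compute pipeline, and dispatch it as you probably do in Unity. Swapchain presentation and command buffer submission shouldn't be any different from how the graphics path works in the tutorials.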
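Both `vkCmdDispatch` and Unity's `ComputeShader.Dispatch` take a number of thread groups, not a number of pixels, so the dispatch size depends on the group size declared in the shader. A minimal sketch of that arithmetic follows; the helper names `ceilDiv` and `groupsFor` are hypothetical, and the 1920x1080 target is assumed from the question.

```cpp
#include <cstdint>

struct GroupCount { std::uint32_t x, y; };

// Integer division that rounds up, so a partial group at the edge is still dispatched.
constexpr std::uint32_t ceilDiv(std::uint32_t n, std::uint32_t d)
{
    return (n + d - 1) / d;
}

// Number of thread groups needed to cover width x height pixels
// when each group is localX x localY threads.
constexpr GroupCount groupsFor(std::uint32_t width, std::uint32_t height,
                               std::uint32_t localX, std::uint32_t localY)
{
    return GroupCount{ ceilDiv(width, localX), ceilDiv(height, localY) };
}
```

For 1920x1080, the 8x8 groups of the first shader divide evenly (240 x 135 groups), but the 16x16 groups of the second do not: 1080 / 16 is 67.5, so 68 rows of groups are needed and the last row covers 8 lines past the bottom of the image. In that case the shader usually starts with an early return for threads whose `id.x` or `id.y` falls outside the image.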