Efficiently calculating grid-based point density in a 3D point cloud

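I have a 3D point cloud matrix, and I'm trying to calculate the largest point density within a smaller volume inside the matrix. I'm currently using a 3D grid histogram system, where I loop through every point in the matrix and increment the value of the corresponding grid cell. Then I can simply take the maximum of the grid matrix.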

I have written code that works, but it is far too slow for what I'm trying to do:

import numpy as np

def densityPointCloud(points, gridCount, gridSize):
    hist = np.zeros((gridCount, gridCount, gridCount), np.uint16)

    rndPoints = np.rint(points/gridSize) + int(gridCount/2)
    rndPoints = rndPoints.astype(int)


    for point in rndPoints:
        if np.amax(point) < gridCount and np.amin(point) >= 0:
            hist[point[0]][point[1]][point[2]] += 1

    return hist


cloud = (np.random.rand(100000, 3)*10)-5
histogram = densityPointCloud(cloud , 50, 0.2)
print(np.amax(histogram))

Are there any shortcuts for doing this more efficiently?

Here's a start:

import numpy as np
import time
from collections import Counter

# if you need the whole histogram object
def dpc2(points, gridCount, gridSize):

    hist = np.zeros((gridCount, gridCount, gridCount), np.uint16)
    rndPoints = np.rint(points/gridSize) + int(gridCount/2)
    rndPoints = rndPoints.astype(int)
    inbounds = np.logical_and(np.amax(rndPoints,axis = 1) < gridCount, np.amin(rndPoints,axis = 1) >= 0)

    for point in rndPoints[inbounds,:]:
        hist[point[0]][point[1]][point[2]] += 1

    return hist

# just care about a max point
def dpc3(points, gridCount, gridSize):

    rndPoints = np.rint(points/gridSize) + int(gridCount/2)
    rndPoints = rndPoints.astype(int)
    inbounds = np.logical_and(np.amax(rndPoints,axis = 1) < gridCount,
        np.amin(rndPoints,axis = 1) >= 0)
    # cheap hashing
    phashes = gridCount*gridCount*rndPoints[inbounds,0] + gridCount*rndPoints[inbounds,1] + rndPoints[inbounds,2]
    max_h, max_v = Counter(phashes).most_common(1)[0]

    max_coord = [(max_h // (gridCount*gridCount)) % gridCount,(max_h // gridCount) % gridCount,max_h % gridCount]
    return (max_coord, max_v)

# TESTING
cloud = (np.random.rand(200000, 3)*10)-5
t1 = time.perf_counter()
hist1 = densityPointCloud(cloud , 50, 0.2)
t2 = time.perf_counter()
hist2 = dpc2(cloud,50,0.2)
t3 = time.perf_counter()
hist3 = dpc3(cloud,50,0.2)
t4 = time.perf_counter()
print(f"task 1: {round(1000*(t2-t1))}ms\ntask 2: {round(1000*(t3-t2))}ms\ntask 3: {round(1000*(t4-t3))}ms")
print(f"max value is {hist3[1]}, achieved at {hist3[0]}")
print(np.all(np.equal(hist1, hist2)))  # check that results are identical
# check for an equal max - the histogram may be multi-modal, so the point won't
# necessarily match
print(np.unravel_index(np.argmax(hist2, axis=None), hist2.shape))

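The idea is to do all of the if/and comparisons in one go: let numpy perform them (efficiently, in C) rather than doing them 'manually' inside a Python loop. This also means we only iterate over the points that will actually cause hist to be incremented.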
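Taking the same idea one step further, the remaining Python loop can probably go too. This is only a sketch, not something I've benchmarked: dpc_bincount is a name I made up, and it assumes the same rounding rule as the functions above. It flattens each (i, j, k) cell to a single index and lets np.bincount do the counting. Note that it returns the default integer dtype rather than uint16.

```python
import numpy as np

def dpc_bincount(points, gridCount, gridSize):
    # Same binning rule as above: round to the nearest cell, shift to the grid centre.
    idx = (np.rint(points / gridSize) + gridCount // 2).astype(int)
    # Keep only points whose three indices all fall inside the grid.
    inbounds = np.all((idx >= 0) & (idx < gridCount), axis=1)
    # Flatten (i, j, k) into one cell id per point.
    flat = np.ravel_multi_index(tuple(idx[inbounds].T), (gridCount,) * 3)
    # Count occurrences of every cell id in a single vectorised pass.
    counts = np.bincount(flat, minlength=gridCount ** 3)
    return counts.reshape((gridCount,) * 3)
```

np.amax on the result then gives the maximum density, exactly as with the original histogram.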

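If you think there will be a lot of empty space in your cloud, you could also consider using a sparse data structure for hist; memory allocation can become a bottleneck for very large data.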
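One way to get a sparse result without pulling in another library is to store counts only for occupied cells. A rough sketch, assuming the same hashing scheme as dpc3 (dpc_sparse is a hypothetical name, and this is one option among several):

```python
import numpy as np

def dpc_sparse(points, gridCount, gridSize):
    # Same binning rule as above; int64 keeps the flattened key from overflowing.
    idx = (np.rint(points / gridSize) + gridCount // 2).astype(np.int64)
    inbounds = np.all((idx >= 0) & (idx < gridCount), axis=1)
    idx = idx[inbounds]
    # One integer key per occupied cell, as in the "cheap hashing" of dpc3.
    keys = (idx[:, 0] * gridCount + idx[:, 1]) * gridCount + idx[:, 2]
    # Only occupied cells are returned, so memory scales with them, not gridCount**3.
    cells, counts = np.unique(keys, return_counts=True)
    return cells, counts
```

The densest cell is cells[np.argmax(counts)], and np.unravel_index(cell, (gridCount,) * 3) turns a key back into grid coordinates. np.unique sorts the keys, so for small dense grids the full histogram may well be faster.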

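I haven't benchmarked this scientifically, but it appears to run ~2-3x faster (v2) and 6-8x faster (v3)! If you want all of the points that tie for the max. density, it would be easy to extract them from the Counter object.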
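For the ties, something along these lines should do. It assumes you keep the Counter(phashes) object around instead of only calling most_common(1), and tied_max_cells is just an illustrative name:

```python
from collections import Counter

def tied_max_cells(counter, gridCount):
    # Nothing counted means no maximum to report.
    if not counter:
        return [], 0
    max_v = max(counter.values())
    # Decode every hash whose count equals the maximum back into (i, j, k).
    cells = [
        (int(h // (gridCount * gridCount)),
         int((h // gridCount) % gridCount),
         int(h % gridCount))
        for h, v in counter.items()
        if v == max_v
    ]
    return cells, max_v
```

Inside dpc3 this would be called as tied_max_cells(Counter(phashes), gridCount).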