np.array[:,None,:2] confusion, IoU calculation issue

Question

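I have the function shown below, which calculates the IoU (intersection over union) of two sets of boxes. I understand the code up to the point where `lt` and `rb` are computed, but I don't know what that part of the code means. Can anyone help me? I also calculated my IoU using code from another post/website (here), and it gives a different result from what I'm trying to achieve, as it seems to miss some items. More info can be found here, where I asked my original question and where my code and purpose are visible.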

import numpy as np


def box_iou_calc(boxes1, boxes2):
    # https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py
    """
    Return intersection-over-union (Jaccard index) of boxes.
    Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
    Arguments:
        boxes1 (Array[N, 4])
        boxes2 (Array[M, 4])
    Returns:
        iou (Array[N, M]): the NxM matrix containing the pairwise
            IoU values for every element in boxes1 and boxes2

    This implementation is taken from the above link and changed so that it only uses numpy.
    """

    def box_area(box):
        # box has shape (4, n): rows are x1, y1, x2, y2
        return (box[2] - box[0]) * (box[3] - box[1])

    area1 = box_area(boxes1.T)
    area2 = box_area(boxes2.T)

    lt = np.maximum(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]
    rb = np.minimum(boxes1[:, None, 2:], boxes2[:, 2:])  # [N,M,2]

    inter = np.prod(np.clip(rb - lt, a_min=0, a_max=None), 2)
    return inter / (area1[:, None] + area2 - inter)  # iou = inter / (area1 + area2 - inter)

Answer 1

The docstring says the arrays are:

    boxes1 (Array[N, 4])
    boxes2 (Array[M, 4])

With a bit of reading in the numpy docs and some experimentation, it becomes clear that:

boxes1[:, None, :2]

takes the first 2 columns of boxes1 (x1 and y1) and adds a dimension of size 1. The resulting shape is (N,1,2). In this context None is the same as np.newaxis.
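This is easy to confirm in an interactive session. The snippet below is only a sketch: the boxes1 values are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical example data: N = 3 boxes in (x1, y1, x2, y2) format.
boxes1 = np.array([[0, 0, 2, 2],
                   [1, 1, 3, 3],
                   [5, 5, 6, 6]])

# Keep the first two columns and insert a new axis of length 1 in the middle.
top_left = boxes1[:, None, :2]

print(np.newaxis is None)      # True: np.newaxis is just an alias for None
print(boxes1[:, :2].shape)     # (3, 2)
print(top_left.shape)          # (3, 1, 2)
```

The data is unchanged by the extra axis; only the shape differs, which is what matters for the broadcasting step.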

boxes2[:, :2]

is simpler; it just returns an array of shape (M,2).

np.maximum combines the 2 arrays using the broadcasting rules:

(N,1,2) with (M,2) => (N,1,2) with (1,M,2) => (N,M,2)

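as noted in the code comment. You can think of this as a kind of outer maximum of the 2 arrays: each of the N rows (boxes) of one array is compared with each of the M rows of the other, so element [i, j] of the result holds the elementwise maximum for box i of boxes1 and box j of boxes2.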
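To see that the broadcast really is a pairwise comparison, it can be checked against explicit loops. This is a minimal sketch with made-up boxes (N = 2, M = 3); the name lt_loop is introduced here purely for the comparison.

```python
import numpy as np

# Hypothetical data: N = 2 and M = 3 boxes in (x1, y1, x2, y2) format.
boxes1 = np.array([[0, 0, 2, 2],
                   [1, 1, 3, 3]])
boxes2 = np.array([[0, 0, 1, 1],
                   [1, 0, 4, 2],
                   [2, 2, 5, 5]])

# Broadcast version: (N,1,2) against (M,2) gives (N,M,2).
lt = np.maximum(boxes1[:, None, :2], boxes2[:, :2])
print(lt.shape)  # (2, 3, 2)

# Loop version: lt_loop[i, j] compares box i of boxes1 with box j of boxes2.
n, m = len(boxes1), len(boxes2)
lt_loop = np.empty((n, m, 2), dtype=boxes1.dtype)
for i in range(n):
    for j in range(m):
        lt_loop[i, j] = np.maximum(boxes1[i, :2], boxes2[j, :2])

print(np.array_equal(lt, lt_loop))  # True
```

The broadcast form avoids the Python loops, which is presumably why the original code is written that way.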

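I assume lt and rb stand for "left-top" and "right-bottom": the corners of the intersection (not the union) of each pair of boxes. The elementwise maximum of the two (x1, y1) corners gives the top-left corner of the overlap, and the elementwise minimum of the two (x2, y2) corners gives its bottom-right corner. rb - lt is then the overlap's width and height, which np.clip forces to 0 when the boxes do not overlap.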
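A worked example on single boxes makes the role of lt and rb easier to see. This is a sketch with made-up boxes a, b and c, assuming image coordinates where y grows downward (so the smaller y is the "top").

```python
import numpy as np

# Hypothetical boxes in (x1, y1, x2, y2) format.
a = np.array([0.0, 0.0, 2.0, 2.0])
b = np.array([1.0, 1.0, 3.0, 3.0])   # overlaps a in the unit square (1,1)-(2,2)
c = np.array([5.0, 5.0, 6.0, 6.0])   # does not overlap a at all


def pair_iou(p, q):
    lt = np.maximum(p[:2], q[:2])     # left-top corner of the overlap
    rb = np.minimum(p[2:], q[2:])     # right-bottom corner of the overlap
    wh = np.clip(rb - lt, 0, None)    # width and height, negative values become 0
    inter = wh.prod()
    area_p = (p[2] - p[0]) * (p[3] - p[1])
    area_q = (q[2] - q[0]) * (q[3] - q[1])
    return inter / (area_p + area_q - inter)


iou_ab = pair_iou(a, b)   # overlap area 1, union area 4 + 4 - 1 = 7, so 1/7
iou_ac = pair_iou(a, c)   # rb - lt is negative, clipped to 0, so IoU is 0
print(iou_ab, iou_ac)
```

For a and c, rb - lt comes out negative in both coordinates; without the clip, the product of two negative numbers would give a spurious positive "intersection".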

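When deciphering numpy code, it's a good idea to have the numpy docs at hand, along with an interactive session where you can test bits of the code.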

(I would normally illustrate this with a small example of my own, but my current computer setup doesn't allow me to do that.)
