RuntimeError: The size of tensor a (10) must match the size of tensor b (3) at non-singleton dimension 0

I am using this intersection over union code to compute the IoU between my predictions and my targets:

def intersection_over_union(boxes_preds, boxes_labels):
    """
    Calculates intersection over union
    Parameters:
        boxes_preds (tensor): Predictions of Bounding Boxes (BATCH_SIZE, 4)
        boxes_labels (tensor): Correct labels of Bounding Boxes (BATCH_SIZE, 4)
        Both are expected in corner format (x1, y1, x2, y2).
    Returns:
        tensor: Intersection over union for all examples
    """


    box1_x1 = boxes_preds[..., 0:1]
    box1_y1 = boxes_preds[..., 1:2]
    box1_x2 = boxes_preds[..., 2:3]
    box1_y2 = boxes_preds[..., 3:4]  # (N, 1)
    box2_x1 = boxes_labels[..., 0:1]
    box2_y1 = boxes_labels[..., 1:2]
    box2_x2 = boxes_labels[..., 2:3]
    box2_y2 = boxes_labels[..., 3:4]

    x1 = torch.max(box1_x1, box2_x1)
    y1 = torch.max(box1_y1, box2_y1)
    x2 = torch.min(box1_x2, box2_x2)
    y2 = torch.min(box1_y2, box2_y2)

    # .clamp(0) is for the case when they do not intersect
    intersection = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)

    box1_area = abs((box1_x2 - box1_x1) * (box1_y2 - box1_y1))
    box2_area = abs((box2_x2 - box2_x1) * (box2_y2 - box2_y1))

    return intersection / (box1_area + box2_area - intersection + 1e-6)

My inputs look like this.

My target bounding boxes, from print(targets[0]['boxes']):

tensor([[217., 481., 249., 511.],
        [435., 191., 467., 223.],
        [471.,  86., 503., 118.]])

My predicted bounding boxes, from predictions['boxes']:

tensor([[ 29.7859, 354.9666,  63.0900, 387.6363],
        [469.1072,  85.6840, 503.1974, 119.7137],
        [ 89.3957, 314.1584, 123.9789, 347.1621],
        [432.2971, 188.4454, 468.4712, 227.3808],
        [214.5407, 482.0136, 248.7030, 512.0000],
        [329.1979, 340.8802, 366.3720, 375.8683],
        [298.5089,  99.0098, 334.4280, 129.4205],
        [  0.0000, 347.7724,  17.3409, 384.5709],
        [485.4312, 181.3882, 512.0000, 213.2009],
        [144.5959, 356.5197, 183.4489, 387.4958]])

However, when I apply the IoU function:

iou = intersection_over_union(predictions['boxes'], targets[0]['boxes'])

I get this error:

RuntimeError: The size of tensor a (10) must match the size of tensor b (3) at non-singleton dimension 0

I am not sure how to fix the function. My guess is that the error means I have more predictions than targets...

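IoU cannot be expressed this way; it is usually written as scalar code. The function above pairs row i of the predictions with row i of the targets, so it only works when both tensors have the same number of rows. Here you need to compare every box in the target set with every box in the predictions, and `torch.max()` does not do that when it is given a (10, 1) tensor and a (3, 1) tensor. Try writing (or just copying from somewhere) scalar Python code for IoU, and once that works, you can optimize it with tensor operations if you really need to.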
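The scalar approach suggested above could look roughly like the following. This is only a sketch, assuming boxes are in corner format (x1, y1, x2, y2); the names `iou_scalar` and `pairwise_iou_loop` are made up for illustration and do not come from any library.

```python
import torch


def iou_scalar(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) sequences of floats."""
    # Corners of the overlapping rectangle (if there is one).
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Width or height is negative when the boxes do not overlap, so clip at 0.
    inter = max(x2 - x1, 0.0) * max(y2 - y1, 0.0)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pairwise_iou_loop(preds, targets):
    """Compare every prediction with every target; returns an (N, M) tensor."""
    out = torch.zeros(len(preds), len(targets))
    for i, p in enumerate(preds.tolist()):
        for j, t in enumerate(targets.tolist()):
            out[i, j] = iou_scalar(p, t)
    return out
```

Called as `pairwise_iou_loop(predictions['boxes'], targets[0]['boxes'])`, this would return one row per prediction and one column per target. It is slow for large inputs, but it is easy to check by hand.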

I went with torchvision's IoU function:

iou = torchvision.ops.box_iou(predictions['boxes'], targets[0]['boxes'])

This runs without an error.

I also found this implementation (from https://github.com/amdegroot/ssd.pytorch/blob/master/layers/box_utils.py#L48):


def intersect(box_a, box_b):
    """ We resize both tensors to [A,B,2] without new malloc:
    [A,2] -> [A,1,2] -> [A,B,2]
    [B,2] -> [1,B,2] -> [A,B,2]
    Then we compute the area of intersect between box_a and box_b.
    Args:
      box_a: (tensor) bounding boxes, Shape: [A,4].
      box_b: (tensor) bounding boxes, Shape: [B,4].
    Return:
      (tensor) intersection area, Shape: [A,B].
    """
    A = box_a.size(0)
    B = box_b.size(0)
    max_xy = torch.min(box_a[:, 2:].unsqueeze(1).expand(A, B, 2),
                       box_b[:, 2:].unsqueeze(0).expand(A, B, 2))
    min_xy = torch.max(box_a[:, :2].unsqueeze(1).expand(A, B, 2),
                       box_b[:, :2].unsqueeze(0).expand(A, B, 2))
    inter = torch.clamp((max_xy - min_xy), min=0)
    return inter[:, :, 0] * inter[:, :, 1]


def jaccard(box_a, box_b):
    """Compute the jaccard overlap of two sets of boxes.  The jaccard overlap
    is simply the intersection over union of two boxes.  Here we operate on
    ground truth boxes and default boxes.
    E.g.:
        A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B)
    Args:
        box_a: (tensor) Ground truth bounding boxes, Shape: [num_objects,4]
        box_b: (tensor) Prior boxes from priorbox layers, Shape: [num_priors,4]
    Return:
        jaccard overlap: (tensor) Shape: [box_a.size(0), box_b.size(0)]
    """
    inter = intersect(box_a, box_b)
    area_a = ((box_a[:, 2]-box_a[:, 0]) *
              (box_a[:, 3]-box_a[:, 1])).unsqueeze(1).expand_as(inter)  # [A,B]
    area_b = ((box_b[:, 2]-box_b[:, 0]) *
              (box_b[:, 3]-box_b[:, 1])).unsqueeze(0).expand_as(inter)  # [A,B]
    union = area_a + area_b - inter
    return inter / union  # [A,B]
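The `unsqueeze`/`expand` pattern above can also be written with implicit broadcasting, which is one way the function from the question might be adapted to handle 10 predictions against 3 targets. This is a sketch, not the SSD or torchvision implementation; `pairwise_iou_broadcast` is a hypothetical name, and boxes are assumed to be in (x1, y1, x2, y2) format.

```python
import torch


def pairwise_iou_broadcast(boxes_preds, boxes_labels, eps=1e-6):
    """IoU between every prediction and every label.

    boxes_preds:  (N, 4) tensor, boxes_labels: (M, 4) tensor.
    Returns an (N, M) tensor.
    """
    # Add singleton axes so that (N, 1, 4) and (1, M, 4) broadcast to (N, M, 4).
    p = boxes_preds[:, None, :]
    t = boxes_labels[None, :, :]

    # Corners of the intersection rectangle, each of shape (N, M).
    x1 = torch.max(p[..., 0], t[..., 0])
    y1 = torch.max(p[..., 1], t[..., 1])
    x2 = torch.min(p[..., 2], t[..., 2])
    y2 = torch.min(p[..., 3], t[..., 3])

    # clamp(min=0) handles pairs that do not intersect.
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    area_p = (p[..., 2] - p[..., 0]) * (p[..., 3] - p[..., 1])  # (N, 1)
    area_t = (t[..., 2] - t[..., 0]) * (t[..., 3] - t[..., 1])  # (1, M)

    # eps avoids division by zero for degenerate boxes.
    return inter / (area_p + area_t - inter + eps)
```

The original function failed because it compared a (10, 1) slice with a (3, 1) slice, and those shapes cannot be broadcast along dimension 0. With the extra axes, the shapes become (10, 1) and (1, 3), which broadcast to (10, 3).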

Applying jaccard produces the same output as torchvision's own function:

iou = jaccard(predictions['boxes'], targets[0]['boxes'])

Example of the tensor you get when printing the IoU:

tensor([[0.0000, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.9322],
        [0.8021, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.0000]])