Open CV snap points to a rectangle of a specific size

Question

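I am trying to detect a certain type of image on pages of degraded quality, where the image may appear at any rotation and translation. I need to "crop" the detected image from the page, so I need both its rotation and its coordinates. A typical example is an image that has been photocopied onto an A4 page.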

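I am using SIFT to detect the object on the scanned page. The images can be rotated and translated, but they are not skewed and have no perspective distortion. I am using the classic feature-matching approach (SIFT, SURF, ORB, etc.), but it uses a perspective transform to create the 4 points of the bounding polygon. The problem is that the keypoints do not line up perfectly (because the image quality varies), so the projection assumes a spatial distortion and the polygon is, correctly for that model, distorted.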

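The approach I would like to try is to "snap" the detected polygon points to the dimensions/area of the input image. That should let me determine the rotation and translation of the image on the page.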
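
One way to read the "snap" idea: if the page only rotates and shifts the image, a rigid transform (rotation plus translation, no scale or skew) fitted to the matched keypoints keeps the template rectangle at exactly its original size. The following is a minimal pure-Python sketch under that assumption. `fit_rigid_2d` and `place_rectangle` are hypothetical helper names, not OpenCV functions. The fit is plain least squares, so it assumes that outliers have already been discarded (for example by keeping only RANSAC inliers) and that the scan has the same resolution as the template.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation + translation (no scale, no skew) mapping src onto dst.

    src, dst: equal-length sequences of (x, y) pairs that correspond one to one.
    Returns (theta, tx, ty) with theta in radians.
    """
    n = len(src)
    if n < 2 or n != len(dst):
        raise ValueError("need at least two matched point pairs")

    # Centroids of both point sets
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n

    # Accumulate dot and cross products of the centered point pairs
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    # Optimal rotation angle, then the translation that aligns the centroids
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty

def place_rectangle(w, h, theta, tx, ty):
    """Corners of the w-by-h template rectangle after the rigid transform."""
    c, s = math.cos(theta), math.sin(theta)
    corners = [(0, 0), (0, h), (w, h), (w, 0)]
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in corners]
```

Here `src` would be the template points (`kp1[m.queryIdx].pt`) and `dst` the page points (`kp2[m.trainIdx].pt`). Because the model has no scale or skew term, the returned rectangle always has sides of length `w` and `h`, and `theta`, `tx`, `ty` are the rotation and translation directly.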

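Things I have tried (without success):

Filtering the keypoints to remove outliers and minimise the distortion.
Affine/rotation/etc. matrices, but they assume that the points in the samples are equidistant and do not approximate.
ICP: this might work, but there are not enough samples, and it seems more like a general approach than a concrete method. I am sure there is a better way.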

import cv2
import numpy as np
import matplotlib.pyplot as plt

def detect(img, frame, detector):
    frame = frame.copy()
    kp1, desc1 = detector.detectAndCompute(img, None)
    kp2, desc2 = detector.detectAndCompute(frame, None)

    index_params = dict(algorithm=0, trees=5)
    search_params = dict()
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(desc1, desc2, k=2)
    good_points = []
    for m, n in matches:
        if m.distance < 0.5 * n.distance:
            good_points.append(m)
            if(len(good_points) == 20):
                break

    # out_img=cv2.drawMatches(img, kp1, frame, kp2, good_points, flags=2, outImg=None)
    # plt.figure(figsize = (6*4, 8*4))
    # plt.imshow(out_img)        
    
    if len(good_points) > 10: # require more than 10 good matches (a homography needs at least 4)
        # Get the matching points
        query_pts = np.float32([kp1[m.queryIdx].pt for m in good_points]).reshape(-1, 1, 2)
        train_pts = np.float32([kp2[m.trainIdx].pt for m in good_points]).reshape(-1, 1, 2)
        
        matrix, mask = cv2.findHomography(query_pts, train_pts, cv2.RANSAC, 5.0)
        matches_mask = mask.ravel().tolist()
        h, w = img.shape
        pts = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
        dst = cv2.perspectiveTransform(pts, matrix)
        
        
        overlayImage = cv2.polylines(frame, [np.int32(dst)], True, (0, 0, 0), 3)
        plt.figure(figsize = (6*2, 8*2))
        plt.imshow(overlayImage)

sift = cv2.SIFT_create()
for frame in frames:
    detect(img, frame, sift)

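Here is an example page containing the picture we want to detect.
Blue line: the correctly sized rectangle
Red line: the polygon determined using the perspective transform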

Answer 1

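I stumbled upon a post that shows how to extract the minimum bounding box from a set of points. This works very well, because it also exposes the rotation.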

import cv2
import numpy as np
import matplotlib.pyplot as plt

def detect_ICP(img, frame, detector):
    frame = frame.copy()
    kp1, desc1 = detector.detectAndCompute(img, None)
    kp2, desc2 = detector.detectAndCompute(frame, None)

    index_params = dict(algorithm=0, trees=5)
    search_params = dict()
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(desc1, desc2, k=2)
    matches = sorted(matches, key=lambda x: x[0].distance + 0.5 * x[1].distance)
    good_points = []
    for m, n in matches:
        # Ratio test: keep a match only if it is clearly better than the runner-up
        if m.distance < 0.5 * n.distance:
            good_points.append(m)

    out_img = cv2.drawMatches(img, kp1, frame, kp2, good_points, flags=2, outImg=None)
    plt.figure(figsize=(6*4, 8*4))
    plt.imshow(out_img)

    if len(good_points) > 10:  # require more than 10 good matches (a homography needs at least 4)
        # Get the matching points
        query_pts = np.float32([kp1[m.queryIdx].pt for m in good_points]).reshape(-1, 1, 2)
        train_pts = np.float32([kp2[m.trainIdx].pt for m in good_points]).reshape(-1, 1, 2)

        matrix, mask = cv2.findHomography(query_pts, train_pts, cv2.RANSAC, 5.0)
        # matches_mask = mask.ravel().tolist()
        h, w = img.shape  # assumes a single-channel (grayscale) template
        pts = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
        dst = cv2.perspectiveTransform(pts, matrix)

        # determine the minimum bounding box
        minAreaRect = cv2.minAreaRect(dst)    # This will have size and rotation information
        rotatedBox = cv2.boxPoints(minAreaRect)
        rotatedBox = np.float32(rotatedBox).reshape(-1, 1, 2)

        overlayImage = cv2.polylines(frame, [np.int32(rotatedBox)], True, (0, 0, 0), 3)
        plt.figure(figsize=(6*2, 8*2))
        plt.imshow(overlayImage)
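
The minimum bounding box encloses the distorted polygon, so its side lengths will generally not match the template exactly. As a possible follow-up (not part of the code above), the sketch below keeps the center and orientation of the box and rebuilds a rectangle with the known template size. `snap_box_to_size` is a hypothetical helper name. It expects the four corner points in order around the rectangle, as returned by `cv2.boxPoints(minAreaRect)` before any reshape. The orientation is taken from the longer edge, so it is ambiguous by 180 degrees, and unreliable when the template is close to square.

```python
import math

def snap_box_to_size(box, w, h):
    """Rebuild a rectangle of the known size that shares the center and
    orientation of `box` (four (x, y) corners in order around the box).

    Returns (corners, theta), where theta is the direction of the long
    side in radians.
    """
    if len(box) != 4:
        raise ValueError("box must contain exactly four corner points")

    # Center of the box: mean of its four corners
    cx = sum(p[0] for p in box) / 4.0
    cy = sum(p[1] for p in box) / 4.0

    # Two adjacent edges; the longer one defines the orientation
    e1 = (box[1][0] - box[0][0], box[1][1] - box[0][1])
    e2 = (box[2][0] - box[1][0], box[2][1] - box[1][1])
    long_edge = e1 if math.hypot(*e1) >= math.hypot(*e2) else e2
    theta = math.atan2(long_edge[1], long_edge[0])

    # Unit vector along the long side, and its perpendicular
    ux, uy = math.cos(theta), math.sin(theta)
    vx, vy = -uy, ux

    # Half-lengths taken from the template, not from the detected box
    a = max(w, h) / 2.0
    b = min(w, h) / 2.0

    signs = ((-1, -1), (1, -1), (1, 1), (-1, 1))
    corners = [(cx + sa * a * ux + sb * b * vx,
                cy + sa * a * uy + sb * b * vy) for sa, sb in signs]
    return corners, theta
```

Working from the corner points rather than from the angle value returned by `cv2.minAreaRect` avoids depending on how a particular OpenCV version reports that angle.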

opencv

object-detection

surf

sift