How to crop an image to only the text section with Python OpenCV?

Question

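I want to crop an image to extract only the text section. There are thousands of these images, all of different sizes, so I can't hard-code the coordinates. I'm trying to remove the unwanted lines on the left and at the bottom. How can I do this?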

(Images: Original, Expected)

Answer 1

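Determine the minimum spanning bounding box by finding all non-zero points in the image, then crop the image using this bounding box. Finding contours is time-consuming and unnecessary here, especially since your text is axis-aligned. You can achieve your goal by combining cv2.findNonZero and cv2.boundingRect.

Hope this works: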

import numpy as np
import cv2
# Read in the image
img = cv2.imread(r"W430Q.png")
img = img[:-20, :-20]  # Perform pre-cropping
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
gray = 255*(gray < 50).astype(np.uint8)  # To invert the text to white
# Perform noise filtering
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, np.ones((2, 2), dtype=np.uint8))
coords = cv2.findNonZero(gray)  # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords)  # Find minimum spanning bounding box
# Crop the image - note we do this on the original image
rect = img[y:y+h, x:x+w]
cv2.imshow("Cropped", rect)  # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()

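In the thresholding line of the code above (gray = 255*(gray < 50).astype(np.uint8)), I keep only the pixels below 50 so that the dark text becomes white. Because the comparison produces a boolean array, I convert it to uint8 and then scale by 255. The text is effectively inverted.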
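
To make the thresholding step concrete, here is a minimal sketch on a tiny hand-made array. The pixel values are hypothetical and chosen only for illustration; the cutoff of 50 is the one used in the answer.

```python
import numpy as np

# Hypothetical 2x3 grayscale patch: dark "text" pixels (< 50) on a light background.
gray = np.array([[10, 200, 30],
                 [255, 49, 50]], dtype=np.uint8)

mask = gray < 50                      # boolean array: True where the pixel is dark
binary = 255 * mask.astype(np.uint8)  # uint8 image: 255 for text, 0 for background

print(binary)
```

Note that a pixel equal to 50 is not selected, because the comparison is strict.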

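Then, using cv2.findNonZero, we find all non-zero locations in this image. We pass these to cv2.boundingRect, which returns the top-left corner of the bounding box along with its width and height. Finally, we use this to crop the image. Note that the crop is done on the original image, not on the inverted version.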

Answer 2

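Here's a simple approach:

1. Obtain a binary image. Load the image, convert to grayscale, apply a Gaussian blur, then use Otsu's threshold to obtain a binary black/white image.
2. Remove horizontal lines. Since we're trying to extract only the text, we remove horizontal lines to help with the next step, so that incorrect contours don't get merged together.
3. Merge the text into a single contour. The idea is that characters adjacent to each other are part of a wall of text, so we can dilate to join the individual contours into a single contour to extract.
4. Find contours and extract the ROI. We find contours, sort them by area, then extract the ROI of the largest contour using Numpy slicing.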

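Here's a visualization of each step:

Binary image -> Removed horizontal lines (shown in green)

Dilate to combine into a single contour -> Detected ROI to extract (shown in green)

Result

Code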

import cv2
import numpy as np

# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(thresh, [c], -1, 0, -1)

# Dilate to merge into a single contour
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,30))
dilate = cv2.dilate(thresh, vertical_kernel, iterations=3)

# Find contours, sort for largest contour and extract ROI
cnts, _ = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 4)
    ROI = original[y:y+h, x:x+w]
    break

cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.waitKey()
