How to extract the circular text from that image?

Question

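I have an image that contains circular text. There are two circles in this image. I want to remove the text of the inner circle from the image and extract the text of the outer circle. How can I remove the inner text, and once it is removed, how can I extract the outer text? What are the steps to solve this problem?

Input image: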

Answer 1

Your image is a nice toy example for playing around with cv2.warpPolar, so I wrote some code, which I'll share here as well. This is my approach:

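Convert the input image to grayscale and binarize it, mainly to get rid of JPG artifacts.
Crop the center part of the image to drop the large areas on the left and right; this simplifies the contour detection that follows.
Find the (nested) contours, cf. cv2.RETR_TREE. See the linked answer for an extensive explanation of contour hierarchy.
Filter the found contours by area and sort them, keeping only the four contours that belong to the circles (the inner and outer edge of each of the two circles).
Remove the inner text by simply painting over it, using the contour of the inner circle. If explicitly needed, do the same on the original image.
Rotate the image before remapping, cf. the explanation in the linked cv2.warpPolar documentation.
Remap the image to polar coordinates, and rotate the result for proper OCR.
Run pytesseract with a whitelist restricted to capital letters (and the space character).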
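
The filter-and-sort step works because each drawn circle has two edges, so the two circles yield four large contours whose areas order them from the inside out, while letter contours are much smaller. The following is only a pure-Python illustration of that ordering, not actual cv2 output: the radii, the labels, and the helpers circle_polygon and polygon_area are hypothetical, and the shoelace formula stands in for cv2.contourArea.

```python
import math

def circle_polygon(radius, center=(250.0, 250.0), points=360):
    """Approximate a circular contour by a regular polygon (illustration only)."""
    cx, cy = center
    return [(cx + radius * math.cos(2 * math.pi * i / points),
             cy + radius * math.sin(2 * math.pi * i / points))
            for i in range(points)]

def polygon_area(poly):
    """Shoelace formula, used here as a stand-in for cv2.contourArea."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

# Hypothetical radii: every circle line contributes an outer and an inner edge.
edges = {
    "outer edge of outer circle": 240,
    "inner edge of outer circle": 235,
    "outer edge of inner circle": 150,
    "inner edge of inner circle": 145,
}
contours = [(name, circle_polygon(r)) for name, r in edges.items()]
contours.append(("letter", circle_polygon(12)))  # small glyph-sized contour

# Same idea as the answer: drop small contours, then sort ascending by area.
kept = [c for c in contours if polygon_area(c[1]) > 10000]
kept.sort(key=lambda c: polygon_area(c[1]))
order = [name for name, _ in kept]

for index, name in enumerate(order):
    print(index, name)
```

With this ordering, filling the contour at index 1 covers the whole inner circle including its text, while filling index 0 covers only the text inside the inner circle's line.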

Here's the full code with the correct output:

import cv2
import pytesseract

# Read image
img = cv2.imread('fcJAc.jpg')

# Convert to grayscale, and binarize, especially for removing JPG artifacts
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY_INV)[1]

# Crop center part of image to simplify following contour detection
h, w = gray.shape
l = (w - h) // 2
gray = gray[:, l:l+h]

# Find (nested) contours (cf. cv2.RETR_TREE) w.r.t. the OpenCV version
cnts = cv2.findContours(gray, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Filter and sort contours on area
cnts = [cnt for cnt in cnts if cv2.contourArea(cnt) > 10000]
cnts = sorted(cnts, key=cv2.contourArea)

# Remove inner text by painting over using found contours
# Contour index 1 = outer edge of inner circle
gray = cv2.drawContours(gray, cnts, 1, 0, cv2.FILLED)

# If specifically needed, also remove text in the original image
# Contour index 0 = inner edge of inner circle (to keep inner circle itself)
img[:, l:l+h] = cv2.drawContours(img[:, l:l+h], cnts, 0, (255, 255, 255),
                                 cv2.FILLED)

# Rotate image before remapping to polar coordinate space to maintain
# circular text en bloc after remapping
gray = cv2.rotate(gray, cv2.ROTATE_90_COUNTERCLOCKWISE)

# Actual remapping to polar coordinate space
gray = cv2.warpPolar(gray, (-1, -1), (h // 2, h // 2), h // 2,
                     cv2.INTER_CUBIC + cv2.WARP_POLAR_LINEAR)

# Rotate result for OCR
gray = cv2.rotate(gray, cv2.ROTATE_90_COUNTERCLOCKWISE)

# Actual OCR, limiting to capital letters only
config = '--psm 6 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ "'
text = pytesseract.image_to_string(gray, config=config)
print(text.replace('\n', '').replace('\f', ''))
# CIRCULAR TEXT PHOTOSHOP TUTORIAL

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.19041-SP0
Python:        3.9.1
PyCharm:       2021.1.1
OpenCV:        4.5.2
pytesseract:   5.0.0-alpha.20201127
----------------------------------------
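
The two rotations around cv2.warpPolar follow from how the function lays out its output: in the linear mode, each output column corresponds to a radius and each output row to an angle, with row 0 on the ray pointing right from the center. Circular text therefore ends up running top to bottom, and the rotation beforehand decides where the circle is cut open. Below is a rough pure-Python sketch of that mapping as described in the OpenCV documentation; the helper names polar_dst_size and dst_to_src are made up for illustration, the 400x400 image size is an assumption, and Python's round() only approximates OpenCV's rounding.

```python
import math

def polar_dst_size(max_radius):
    """Output size (width, height) documented for dsize=(-1, -1)."""
    return round(max_radius), round(max_radius * math.pi)

def dst_to_src(col, row, center, max_radius, dsize):
    """Map an output pixel (col = radius axis, row = angle axis) back to
    source image coordinates using the linear polar mapping."""
    width, height = dsize
    k_lin = width / max_radius          # output columns per unit of radius
    k_angle = height / (2 * math.pi)    # output rows per radian
    radius = col / k_lin
    angle = row / k_angle
    cx, cy = center
    return cx + radius * math.cos(angle), cy + radius * math.sin(angle)

# Assumed example: a 400x400 crop, remapped around its center as in the answer.
h = 400
center = (h // 2, h // 2)
size = polar_dst_size(h // 2)

# Row 0 samples the ray pointing right; a quarter of the rows further down,
# the samples come from the ray pointing down (image y grows downward).
right = dst_to_src(100, 0, center, h // 2, size)
down = dst_to_src(100, size[1] // 4, center, h // 2, size)
print(size, right, down)
```

Because the angle runs along the rows, the remapped text is vertical, which is why the result is rotated once more before it is passed to pytesseract.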
