Extract text from images with multiple backgrounds
I have several images with different backgrounds, and I need to ignore the background and extract the digits from each image. For example:
After testing, this is the result I got:
Because of the background color, extracting the text is very difficult.
I am using this code:
import cv2
import pytesseract

image = cv2.imread('AA.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 165, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Invert image and perform morphological operations
inverted = 255 - thresh
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15,3))
close = cv2.morphologyEx(inverted, cv2.MORPH_CLOSE, kernel, iterations=1)

# Find contours and filter using aspect ratio and area
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.01 * peri, True)
    x,y,w,h = cv2.boundingRect(approx)
    aspect_ratio = w / float(h)
    if (aspect_ratio >= 2.5 or area < 75):
        cv2.drawContours(thresh, [c], -1, (255,255,255), -1)

# Blur and perform text extraction
thresh = cv2.GaussianBlur(thresh, (3,3), 0)
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6 -c tessedit_char_whitelist=0123456789')
print(data)

cv2.imshow('close', close)
cv2.imshow('thresh', thresh)
cv2.waitKey()
How can I accurately extract the digits from these images even when the background color changes?
Edit: result after the modification:
Your thresholding is your issue. Here is how I would process the image in Python/OpenCV before doing OCR.
I simply threshold at 165 to make the letters white and the background black, then filter the contours by area to remove small extraneous white regions, and finally invert the result so the text is black on a white background.
Input:
import cv2
import numpy as np

# read the input image
img = cv2.imread("numbers.png")
hh, ww, cc = img.shape

# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# threshold the grayscale image
ret, thresh = cv2.threshold(gray, 165, 255, 0)

# create black image to hold results
results = np.zeros((hh, ww), dtype=np.uint8)

# find contours
cntrs = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cntrs = cntrs[0] if len(cntrs) == 2 else cntrs[1]

# filter contours by area and copy each contour's interior to the black image
for c in cntrs:
    area = cv2.contourArea(c)
    if area > 1000:
        x,y,w,h = cv2.boundingRect(c)
        results[y:y+h, x:x+w] = thresh[y:y+h, x:x+w]

# invert the results image so the letters are black on a white background
results = 255 - results

# write results to disk
cv2.imwrite("numbers_extracted.png", results)

cv2.imshow("THRESH", thresh)
cv2.imshow("RESULTS", results)
cv2.waitKey(0)
cv2.destroyAllWindows()
Thresholded image before contour filtering:
Result after contour filtering and inversion:
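With the digits isolated as black text on a clean white background, the OCR step from the question should behave much better. Here is a minimal sketch of that step, assuming pytesseract and the Tesseract binary are installed, and reading the numbers_extracted.png file written above:

import cv2
import pytesseract

# read the cleaned, inverted image produced by the script above
ocr_input = cv2.imread("numbers_extracted.png", cv2.IMREAD_GRAYSCALE)

# restrict Tesseract to digits and treat the image as a single uniform block of text
config = "--psm 6 -c tessedit_char_whitelist=0123456789"
text = pytesseract.image_to_string(ocr_input, lang="eng", config=config)
print(text.strip())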
P.S. cv2.inRange() could be an alternative to cv2.threshold().
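For example, a rough sketch of that alternative on the same grayscale image (the 165/255 bounds are simply the values used above and would still need tuning per image):

import cv2

img = cv2.imread("numbers.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# keep only pixels in the range 165-255 (the bright letters) as white, everything else black
mask = cv2.inRange(gray, 165, 255)

cv2.imwrite("numbers_inrange_mask.png", mask)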
Of course, this solution may be limited to this particular image, since other images may need different threshold and area-limit values.