How should I remove noise in this image using OpenCV?

Question

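I am trying to use cv2.HoughLines to identify the skew angle of the words in this image.
However, after edge detection, the image clearly has too much noise.

I tried using cv2.medianBlur to remove the noise.
However, that left even more noise.

This means I cannot set a minimum line length threshold for the Hough transform.

What other functions should I look into?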

Image:

After edge detection:

Edit: With Rotem's help, my code now recognizes images with a skew angle between 90 and -90 degrees, including 90 degrees but not -90 degrees.

import cv2
import numpy as np
import imutils
import math
import pytesseract

img = cv2.imread('omezole.jpg')

resized = imutils.resize(img, width=300)
gray = cv2.cvtColor(resized,cv2.COLOR_BGR2GRAY)
th3 = cv2.threshold(gray, 80, 255, cv2.THRESH_BINARY_INV)[1]
minLineLength = 50
maxLineGap = 3
lines = cv2.HoughLinesP(th3, rho=1, theta=np.pi/180, threshold=100, minLineLength=minLineLength, maxLineGap=maxLineGap)
colLineCopy = cv2.cvtColor(th3,cv2.COLOR_GRAY2BGR)

# Draw the non-vertical lines and add each line's angle to ls (vertical lines are recorded as -90)
ls = []
# HoughLinesP returns None when no lines are found
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = line[0].tolist()
        print(line)
        # Check for vertical lines, since tan(90 degrees) is undefined
        if x2 - x1 == 0:
            ls.append(-90)
        else:
            ls.append(math.degrees(math.atan((y2 - y1) / (x2 - x1))))
            cv2.line(colLineCopy, (x1, y1), (x2, y2), (0, 0, 250), 2)

# Special case of strictly vertical words: if more than 20% of the lines are vertical, assume the words are vertical
if not ls:
    angle = 0
elif ls.count(-90) > len(ls) // 5:
    angle = 90
else:
    # Build a filtered list instead of removing items while iterating over ls
    ls = [a for a in ls if a >= -80]
    angle = sum(ls) / len(ls) if ls else 0

rotated = imutils.rotate_bound(resized, -angle)
cv2.imshow("HoughLinesP", colLineCopy)
cv2.imshow("rotated", rotated)

gray = cv2.cvtColor(rotated, cv2.COLOR_BGR2GRAY)
threshINV  = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY_INV)[1]

cv2.imshow("final", threshINV)
#Run OCR
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
custom_config = r'--psm 11'
print(pytesseract.image_to_string(threshINV, config = custom_config))
cv2.waitKey(0)
cv2.destroyAllWindows()
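
A side note on the vertical-line special case in the code above: this is only a minimal sketch, assuming it is acceptable to swap math.atan for math.atan2, which copes with x2 == x1 without a separate branch. The helper name segment_angle_deg is hypothetical (it is not part of OpenCV or imutils), and it folds the result into the (-90, 90] range described in the edit.

```python
import math

def segment_angle_deg(x1, y1, x2, y2):
    """Angle of a line segment in degrees, folded into the range (-90, 90]."""
    # atan2 handles x2 == x1, so vertical segments need no special case
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    # A segment has no direction: fold angles that differ by 180 degrees together
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle

print(segment_angle_deg(0, 0, 10, 10))   # diagonal segment
print(segment_angle_deg(5, 0, 5, 20))    # vertical segment
print(segment_angle_deg(10, 10, 0, 0))   # same diagonal, endpoints swapped
```

With a helper like this, a vertical segment comes out as 90 regardless of endpoint order, so the -90 marker would no longer be needed.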


Answer 1

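A useful way to remove "noise" before edge detection is to apply a threshold that converts the image from grayscale to binary.

Finding the right threshold (automatically) is not always easy.
I set the threshold manually to 50.

Solution using the HoughLinesP code sample: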

import numpy as np
import cv2

# Read input image
img = cv2.imread('omezole.jpg')

# Convert from BGR to grayscale (cv2.imread returns BGR).
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Apply threshold - values of 50 or below go to 0, and values above 50 go to 255.
ret, thresh_gray = cv2.threshold(gray, 50, 255, cv2.THRESH_BINARY)


# https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
edges = cv2.Canny(thresh_gray, 50, 150, apertureSize = 3)


minLineLength = 100
maxLineGap = 5
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi/180, threshold=100, minLineLength=minLineLength, maxLineGap=maxLineGap)

# Draw lines (HoughLinesP returns None when nothing is found)
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = line[0].tolist()
        cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

cv2.imwrite('houghlines.png',img)

Result:

The HoughLines solution is not very robust.
I suggest another solution using findContours:

# Continues from the code above: reuses np, cv2 and thresh_gray
img = cv2.imread('omezole.jpg')

# Inverse polarity:
thresh_gray = 255 - thresh_gray

# Use "open" morphological operation to remove some rough edges
thresh_gray = cv2.morphologyEx(thresh_gray, cv2.MORPH_OPEN, np.ones((5, 5)))


# Find contours over thresh_gray
cnts = cv2.findContours(thresh_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

# Iterate contours
for c in cnts:
    # Only if contour area is large enough:
    if cv2.contourArea(c) > 2000:
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        # convert all floating point coordinates to int (np.int0 was removed in NumPy 2.0)
        box = np.intp(box)
        cv2.drawContours(img, [box], 0, (0, 255, 0), thickness=2)
        angle = rect[2]
        print('angle = ' + str(angle))

cv2.imwrite('findcontours.png', img)

# Show result (for testing).
cv2.imshow('thresh_gray', thresh_gray)
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result:

angle = -21.801406860351562
angle = -21.44773292541504
angle = -21.370620727539062
angle = -21.801406860351562
angle = -22.520565032958984
angle = -22.56700897216797
angle = -23.198591232299805
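
To turn the per-contour angles above into a single skew estimate, something like the following may help. It is a sketch under stated assumptions, not part of the original solution: long_side_angle_deg is a hypothetical helper, and it assumes rect[2] is the angle of the side that minAreaRect reports as the width. The range of rect[2] is reported to differ between OpenCV versions, so folding everything onto the longer side is one way to get comparable values. The median is used here as a simple choice that tolerates a few outliers.

```python
import statistics

def long_side_angle_deg(rect):
    """Angle of the longer side of a minAreaRect-style tuple, folded into (-90, 90]."""
    (_cx, _cy), (w, h), angle = rect
    # Assumption: rect[2] refers to the side reported as width;
    # switch to the other side when the height is the longer one
    if w < h:
        angle += 90
    # A side has no direction, so fold the angle into (-90, 90]
    while angle > 90:
        angle -= 180
    while angle <= -90:
        angle += 180
    return angle

# Angles printed in the result above
angles = [-21.801406860351562, -21.44773292541504, -21.370620727539062,
          -21.801406860351562, -22.520565032958984, -22.56700897216797,
          -23.198591232299805]

skew = statistics.median(angles)
print('skew = ' + str(skew))
```

In the contour loop, long_side_angle_deg(rect) could be appended to a list in place of rect[2] before taking the median.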

python

ocr

opencv

image-processing