Filtering Image For Improving Text Recognition
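I have the source image below (after cropping), and I am trying to do some image processing before reading the text. [image]
With Python and OpenCV, I tried to remove the lines in the background using k-means with k=2, and the result is: [image]
I then tried to smooth the image using the code below: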
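import tempfile

import cv2
import numpy as np
from PIL import Image

# Assumed values: the post does not show how these constants were defined.
IMAGE_SIZE = 1800        # target width in pixels (matches the commented-out size below)
BINARY_THRESHOLD = 180   # fixed cut-off for the first threshold pass

def process_image_for_ocr(file_path):
    # TODO : Implement using opencv
    temp_filename = set_image_dpi(file_path)
    im_new = remove_noise_and_smooth(temp_filename)
    return im_new

def set_image_dpi(file_path):
    # Upscale by an integer factor towards IMAGE_SIZE and save with 300 DPI metadata.
    im = Image.open(file_path)
    length_x, width_y = im.size
    factor = max(1, int(IMAGE_SIZE / length_x))
    size = factor * length_x, factor * width_y
    # size = (1800, 1800)
    im_resized = im.resize(size, Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.jpg')
    temp_filename = temp_file.name
    temp_file.close()  # release the handle before Pillow writes to the same path
    im_resized.save(temp_filename, dpi=(300, 300))
    return temp_filename

def image_smoothening(img):
    # Fixed threshold, then Otsu, then blur, then Otsu again.
    ret1, th1 = cv2.threshold(img, BINARY_THRESHOLD, 255, cv2.THRESH_BINARY)
    ret2, th2 = cv2.threshold(th1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    blur = cv2.GaussianBlur(th2, (1, 1), 0)  # note: a 1x1 kernel leaves the image unchanged
    ret3, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return th3

def remove_noise_and_smooth(file_name):
    img = cv2.imread(file_name, 0)  # 0 = load as grayscale
    filtered = cv2.adaptiveThreshold(img.astype(np.uint8), 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 41, 3)
    kernel = np.ones((1, 1), np.uint8)  # note: a 1x1 kernel makes open/close no-ops
    opening = cv2.morphologyEx(filtered, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel)
    img = image_smoothening(img)
    or_image = cv2.bitwise_or(img, closing)
    return or_image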
The result is: [image]
Can you help me (any ideas) remove the lines in the background of the source image?
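One way to achieve this is to compute a k-means unsupervised segmentation of the image. You just need to play with the k and i_val values to get the desired output.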
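To get a feel for how k and i_val interact, here is a minimal NumPy-only sketch. It is not the OpenCV routine used in this answer: kmeans_1d is a hypothetical helper that runs plain Lloyd iterations on 1-D data with evenly spaced starting centers, and the three intensity groups (text, lines, paper) are synthetic values chosen purely for illustration.

```python
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's algorithm on 1-D data; returns the k centers sorted ascending."""
    values = np.asarray(values, dtype=np.float32)
    # Deterministic start: spread the initial centers evenly over the value range.
    centers = np.linspace(values.min(), values.max(), k).astype(np.float32)
    for _ in range(iters):
        # Assign every value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:  # leave an empty cluster's center where it is
                centers[j] = members.mean()
    return np.sort(centers)

# Synthetic intensities (assumed): dark text ~30, gray lines ~140, paper ~230.
rng = np.random.default_rng(0)
pixels = np.concatenate([
    rng.normal(30, 5, 500),
    rng.normal(140, 5, 500),
    rng.normal(230, 5, 2000),
]).clip(0, 255)

centers = kmeans_1d(pixels, k=3)
for i_val, c in enumerate(centers):
    # Pixels at or below the threshold would come out black with THRESH_BINARY.
    dark = int((pixels <= c).sum())
    print(f"i_val={i_val}: threshold={c:.0f}, dark pixels={dark}")
```

Each sorted center is a candidate threshold, and a higher i_val selects a brighter one, so more pixels end up black. On a real scan the intensities overlap far more than in this toy data, which is why the right k and i_val have to be found by trial.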
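First, you need to create a function that will find the k threshold values. It computes an image histogram (which is returned for inspection; the clustering itself runs on the raw pixel values). .ravel() just converts your numpy array to a 1-D array. np.reshape(img, (-1,1)) then converts it to a 2-D array of shape (n, 1), one sample per row. Next we perform the k_means as described here.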
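The reshaping step is easy to check in isolation. The snippet below is only an illustration on a made-up 2x3 array (not the image from the question) and shows the shape produced at each step.

```python
import numpy as np

# A tiny 2x3 "grayscale image" standing in for a real one.
img = np.array([[10, 20, 30],
                [40, 50, 60]], dtype=np.uint8)

flat = img.ravel()                      # 1-D array with one entry per pixel
samples = np.reshape(flat, (-1, 1))     # 2-D array: one sample (pixel) per row
samples = samples.astype(np.float32)    # cv2.kmeans expects float32 samples

print(flat.shape, samples.shape, samples.dtype)
```

Calling np.reshape(img, (-1, 1)) directly on the 2-D image gives the same result, so the .ravel() call is mainly there for readability.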
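The function takes the input grayscale image, your number of intervals k, and the index of the value you want to threshold at (i_val). It returns the threshold value at your desired i_val.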
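import cv2
import numpy as np

def kmeans(input_img, k, i_val):
    # The histogram is returned for inspection only; clustering runs on the raw pixels.
    hist = cv2.calcHist([input_img], [0], None, [256], [0, 256])
    # Flatten to one float32 sample per row, the layout cv2.kmeans expects.
    img = input_img.ravel()
    img = np.reshape(img, (-1, 1))
    img = img.astype(np.float32)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    flags = cv2.KMEANS_RANDOM_CENTERS
    compactness, labels, centers = cv2.kmeans(img, k, None, criteria, 10, flags)
    centers = np.sort(centers, axis=0)  # ascending: i_val counts from darkest to brightest

    # Return a plain int so it can be passed straight to cv2.threshold.
    return int(centers[i_val, 0]), centers, hist

img = cv2.imread('Y8CSE.jpg', 0)  # 0 = load as grayscale
_, thresh = cv2.threshold(img, kmeans(input_img=img, k=8, i_val=2)[0], 255, cv2.THRESH_BINARY)
cv2.imwrite('text.png', thresh)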
The output is as follows: [image]
You can continue with this method using morphological operators, or pre-mask the image using a Hough transform as seen in the first answer.