Python: Extract a number from a simple image

Question

I have the following image, which I filter by color like this:

lower = np.array([175, 125, 45], dtype="uint8")
upper = np.array([255, 255, 255], dtype="uint8")

mask = cv2.inRange(image, lower, upper)
img = cv2.bitwise_and(image, image, mask=mask)

plt.figure()
plt.imshow(img)
plt.axis('off')
plt.show()

Now if I try to convert it to grayscale like this:

gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

I get:

I want to extract the number shown above.

One suggestion was:

gray = 255 - gray
emp = np.full_like(gray, 255)
emp -= gray
emp[emp==0] = 255
emp[emp<100] = 0
gauss = cv2.GaussianBlur(emp, (3,3), 1)
gauss[gauss<220] = 0
plt.imshow(gauss)

This gives the image:

Then running pytesseract on any of the images:

data = pytesseract.image_to_string(img, config='outputbase digits')

gives:

'\x0c'

Another suggested solution was:

gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
thr = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV)[1]
txt = pytesseract.image_to_string(thr)
plt.imshow(thr)

This also gives

'\x0c'

Not very satisfying... Does anyone have a better solution?

Thanks!

Answer 1

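Two problems prevent pytesseract from detecting your number:

1. The white rectangle around the number (inverting and filling solves this).
2. Noise in the shape of the digits (Gaussian smoothing takes care of this).

The solution proposed by AlexAlex works perfectly if you follow it with a Gaussian filter: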

output: 1,625

import numpy as np
import pytesseract
import cv2

BGR = cv2.imread('11.png')
RGB = cv2.cvtColor(BGR, cv2.COLOR_BGR2RGB)

lower = np.array([175, 125, 45], dtype="uint8")
upper = np.array([255, 255, 255], dtype="uint8")

mask = cv2.inRange(RGB, lower, upper)
img = cv2.bitwise_and(RGB, RGB, mask=mask)

gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

gray = 255 - gray
emp = np.full_like(gray, 255)
emp -= gray

emp[emp==0] = 255
emp[emp<100] = 0

gauss = cv2.GaussianBlur(emp, (3,3), 1)
gauss[gauss<220] = 0

text = pytesseract.image_to_string(gauss, config='outputbase digits')

print(text)
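
To see what the inversion and fill steps in the listing above do to individual pixel values, here is a minimal NumPy-only sketch. The four gray levels are made-up sample values chosen for illustration, not pixels taken from the image.

```python
import numpy as np

# Hypothetical gray levels: masked-out background (0), a dark pixel,
# a mid-gray pixel and a white pixel.
gray = np.array([0, 50, 150, 255], dtype=np.uint8)

inverted = 255 - gray                # [255, 205, 105, 0]
emp = np.full_like(inverted, 255)
emp -= inverted                      # undoes the inversion: [0, 50, 150, 255]

emp[emp == 0] = 255                  # masked-out background becomes white
emp[emp < 100] = 0                   # remaining dark pixels become pure black

print(emp.tolist())                  # [255, 0, 150, 255]
```

In other words, the inversion and the subtraction cancel each other out, and the effective change is that the black background left by the mask turns white while the remaining dark pixels are forced to black.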

Answer 2

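I have a two-step solution:

1. Apply thresholding.
2. Set the page segmentation mode to 7.

When you apply thresholding to the image:

Thresholding is the simplest way of displaying the features of an image.

Now, when we read from the output image: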

txt = image_to_string(thr, config="--psm 7")
print(txt)

The result will be:

| 1,625 |

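Now, why set the page segmentation mode (psm) to 7?

Because mode 7 tells Tesseract to treat the image as a single line of text, which is exactly what this image contains, and that gives an accurate result.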
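
As a rough sketch of how the mode could be passed along, the helper below builds the config string and hands it to pytesseract. The function names `build_tesseract_config` and `read_single_line` are hypothetical, and the optional character whitelist is an addition that is not part of the answer; `tessedit_char_whitelist` is a Tesseract config variable, but its effect can depend on the Tesseract version and OCR engine in use.

```python
def build_tesseract_config(psm=7, whitelist=None):
    """Build a config string for pytesseract.image_to_string (hypothetical helper)."""
    parts = [f"--psm {psm}"]
    if whitelist:
        # Optional: restrict the characters Tesseract is allowed to output.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_single_line(image, whitelist=None):
    """Run OCR on an image assumed to contain exactly one line of text."""
    import pytesseract  # imported lazily so the helper above works without it
    config = build_tesseract_config(psm=7, whitelist=whitelist)
    return pytesseract.image_to_string(image, config=config)


print(build_tesseract_config(7, "0123456789,"))
# --psm 7 -c tessedit_char_whitelist=0123456789,
```

Calling `read_single_line(thr)` would then be equivalent to the `image_to_string(thr, config="--psm 7")` call used in this answer.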

But we still have to post-process the result, since the current result is | 1,625 |

We should remove the | characters:

print("".join([t for t in txt if t != '|']))

Result:

1,625
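
If other stray characters show up besides |, a slightly more general cleanup might keep only digits and thousands separators. This is a minimal sketch: the helper name `clean_ocr_number` is hypothetical, and the raw string below is an assumed example of OCR output (including the trailing form feed), not a captured result.

```python
import re


def clean_ocr_number(txt):
    """Keep only digits and comma separators from raw OCR output."""
    return re.sub(r"[^0-9,]", "", txt)


raw = "| 1,625 |\n\x0c"               # assumed raw OCR output
cleaned = clean_ocr_number(raw)
print(cleaned)                         # 1,625
print(int(cleaned.replace(",", "")))   # 1625
```

Dropping the comma before calling `int` gives a value you can compute with, rather than a string.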

Code:

import cv2
from pytesseract import image_to_string

img = cv2.imread("LZ3vi.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thr = cv2.threshold(gry, 0, 255,
                    cv2.THRESH_BINARY_INV)[1]
txt = image_to_string(thr, config="--psm 7")
print("".join([t for t in txt if t != '|']).strip())

Update

how do you get this clean black and white image from my original image?

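Use 3 steps:

1. Read the image with OpenCV's imread function

img = cv2.imread("LZ3vi.png")

Note that imread returns the image in BGR channel order (not RGB).

2. Convert the image to grayscale

gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

The result will be:

3. Apply thresholding

thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY_INV)[1]

The result will be: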

If you want to learn more about thresholding, read simple-thresholding
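
To make the inverse binary rule concrete, here is a small NumPy sketch of what `cv2.THRESH_BINARY_INV` computes for an 8-bit image. The helper name `binary_inv_threshold` is hypothetical and the pixel values are made up; it illustrates the rule and is not a replacement for `cv2.threshold`.

```python
import numpy as np


def binary_inv_threshold(gray, thresh=0, maxval=255):
    """NumPy illustration of the THRESH_BINARY_INV rule for uint8 images."""
    # Pixels strictly above `thresh` become 0, all others become `maxval`.
    return np.where(gray > thresh, 0, maxval).astype(np.uint8)


gray = np.array([[0, 10], [200, 0]], dtype=np.uint8)   # made-up pixel values
print(binary_inv_threshold(gray).tolist())             # [[255, 0], [0, 255]]
```

With a threshold of 0, only pixels that are exactly black turn white and everything else turns black, so the result depends on the digits being pure black in the grayscale image.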

All my filters, grayscale... get weird colored images

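The reason is that pyplot applies a color map to single-channel images, so when you display a grayscale image you need to set the color map (cmap) to gray: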

plt.imshow(img, cmap='gray')

You can read about the other colormap types here
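
One way to avoid forgetting the color map is to pick the imshow arguments from the array shape. The sketch below assumes 8-bit images; the helper name `imshow_kwargs` is hypothetical. Fixing `vmin` and `vmax` is an extra choice here, so that pyplot does not rescale the image to its own minimum and maximum.

```python
import numpy as np


def imshow_kwargs(img):
    """Pick plt.imshow keyword arguments based on the array shape."""
    if img.ndim == 2:
        # Single-channel image: without cmap, pyplot uses its default color map.
        return {"cmap": "gray", "vmin": 0, "vmax": 255}
    # Three-channel (RGB) image: pyplot shows the colors directly.
    return {}


gray = np.zeros((4, 4), dtype=np.uint8)
color = np.zeros((4, 4, 3), dtype=np.uint8)
print(imshow_kwargs(gray))    # {'cmap': 'gray', 'vmin': 0, 'vmax': 255}
print(imshow_kwargs(color))   # {}

# Typical use (requires matplotlib):
#   import matplotlib.pyplot as plt
#   plt.imshow(gray, **imshow_kwargs(gray))
```

For color images read with cv2.imread, remember that the channels are in BGR order, so convert to RGB before displaying them with pyplot.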


opencv

image-processing

python-3.x

python-tesseract