Python: Detecting a text block and deleting it from an image (OpenCV)
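I am currently working on detecting a paragraph of text in an image so that I can remove it.
I get an input image similar to the one shown above. From there, I want to detect the body of the comment (the message text). The likes, the username, and the avatar are not needed and should be ignored. The body should then be removed from the comment, while everything else stays.
So far I have applied a threshold and found the contours. The problem is that the comment body is not detected as a single region but as many separate contours. How can I combine them? Also, I want to remove the region from the image as soon as I have found its contour. The background color is RGB(17, 17, 17); is there a way to paint over it, or how does that work in OpenCV? I'm quite new to it.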
import cv2
img = cv2.imread("Comment.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, threshold = cv2.threshold(gray, 80, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
The result should look like this:
Any help is appreciated, thanks in advance!
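The idea is really simple. Use morphology to isolate the text you want to detect. From that image, create a mask that removes the region of interest from the input image and produces the final image. Everything is done with morphology. My answer is in C++, but the implementation is straightforward: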
//Read input image:
std::string imagePath = "C://opencvImages//commentImage.png";
cv::Mat imageInput= cv::imread( imagePath );
//Convert it to grayscale:
cv::Mat grayImg;
cv::cvtColor( imageInput, grayImg, cv::COLOR_BGR2GRAY );
//Get binary image via Otsu:
cv::threshold( grayImg, grayImg, 0, 255 , cv::THRESH_OTSU );
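So far, you have generated the binary image. Now, let's dilate the image using a rectangular structuring element (SE) that is wider than it is tall. The idea is to join all the text horizontally and (a little) vertically. If you look at the input image, the “TEST132212” text is separated from the comment by only a small gap, which appears to be just enough to survive the dilate operation. Here I used an SE of size 9 x 6 with 2 iterations: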
cv::Mat morphKernel = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(9, 6) );
int morphIterations = 2;
cv::morphologyEx( grayImg, grayImg, cv::MORPH_DILATE, morphKernel, cv::Point(-1,-1), morphIterations );
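This is the result of the dilation on the input image:
I get a single block where the original comment was. Nice! This is now the largest blob in the image. If I subtract it from the (dilated) binary image, I get a mask that isolates everything that is not the “comment” blob: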
cv::Mat bigBlob = findBiggestBlob( grayImg );
I get this:
Now, the binary mask generation:
cv::Mat binaryMask = grayImg - bigBlob;
//Use the binaryMask to produce the final image:
cv::Mat resultImg;
imageInput.copyTo( resultImg, binaryMask );
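This produces the masked image:
By now you should have noticed the findBiggestBlob function. It is a function I wrote that returns the biggest blob in a binary image. The idea is simply to compute all the contours in the input image, calculate their areas, and keep the contour with the largest area of the bunch. This is the C++ implementation: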
//Function to get the largest blob in a binary image:
cv::Mat findBiggestBlob( cv::Mat &inputImage ){
    cv::Mat biggestBlob = inputImage.clone();
    double largest_area = 0; //contourArea returns a double
    int largest_contour_index = 0;
    std::vector< std::vector<cv::Point> > contours; // Vector for storing contours
    std::vector<cv::Vec4i> hierarchy;
    // Find the contours in the image (cv:: enums replace the legacy CV_* macros)
    cv::findContours( biggestBlob, contours, hierarchy, cv::RETR_CCOMP, cv::CHAIN_APPROX_SIMPLE );
    for( int i = 0; i < (int)contours.size(); i++ ) {
        //Find the area of the contour
        double a = cv::contourArea( contours[i], false );
        //Store the index of largest contour:
        if( a > largest_area ){
            largest_area = a;
            largest_contour_index = i;
        }
    }
    //Once you get the biggest blob, paint it black:
    cv::Mat tempMat = biggestBlob.clone();
    cv::drawContours( tempMat, contours, largest_contour_index, cv::Scalar(0),
                      cv::FILLED, 8, hierarchy );
    //Erase the smaller blobs:
    biggestBlob = biggestBlob - tempMat;
    tempMat.release();
    return biggestBlob;
}
Edit: Since posting this answer I have been learning Python. This is the Python equivalent of the C++ code:
import cv2
import numpy as np
# Set image path
path = "D://opencvImages//"
fileName = "commentImage.png"
# Read Input image
inputImage = cv2.imread(path+fileName)
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Set kernel (structuring element) size:
kernelSize = (9, 6)
# Set operation iterations:
opIterations = 2
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernelSize)
# Perform Dilate:
openingImage = cv2.morphologyEx(binaryImage, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
# Find the big contours/blobs on the filtered image:
biggestBlob = openingImage.copy()
contours, hierarchy = cv2.findContours(biggestBlob, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
largestArea = 0
largestContourIndex = 0
# Loop through the contours, store the biggest one:
for i, c in enumerate(contours):
    # Get the area for the current contour:
    currentArea = cv2.contourArea(c, False)
    # Store the index of largest contour:
    if currentArea > largestArea:
        largestArea = currentArea
        largestContourIndex = i
# Once you get the biggest blob, paint it black:
tempMat = biggestBlob.copy()
# Draw the contours on the mask image:
cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)
# Erase the smaller blobs:
biggestBlob = biggestBlob - tempMat
# Generate the binary mask:
binaryMask = openingImage - biggestBlob
# Use the binaryMask to produce the final image:
resultImg = cv2.bitwise_and(inputImage, inputImage, mask = binaryMask)
cv2.imshow("Result", resultImg)
cv2.waitKey(0)