使用 Hough 变换自动去除直线
Automatically remove straight lines with Hough transform
我正在做关于光学字符识别的论文。我的工作是从图像中正确分割文本字符。
问题是,这种语言中的每一行文本都有单词,其中字符通常由直线连接。这些线的粗细可能相同,也可能不同。
到目前为止,使用投影配置文件,我已经能够分割不与任何直线相连的字符。但是要分割由直线连接的字符,我必须删除这些线。我更喜欢使用霍夫变换来检测和删除那些线条(意思是在黑白图像中,如果线条中的像素是黑色的,则将其设为白色)。
查看包含文字的示例图片:
Sample Image
This是使用投影剖面从上图中分割出的一条线。
和These是使用霍夫变换检测到的线。
霍夫变换代码。使用 This 图片进行测试。
I = imread('line0.jpg');
%I = rgb2gray(I);
BW = edge(I,'canny');
[H,T,R] = hough(BW);
imshow(H,[],'XData',T,'YData',R,'InitialMagnification','fit');
xlabel('\theta'),ylabel('\rho');
axis on, axis normal, hold on;
P = houghpeaks(H,1,'threshold',ceil(0.3*max(H(:))));
x = T(P(:,2));
y = R(P(:,1));
plot(x,y,'s','color','blue');
% Find lines and plot them
lines = houghlines(BW,T,R,P,'FillGap',5,'MinLength',7);
figure, imshow(I), hold on
grid on
max_len = 0;
for k = 1:length(lines)
xy = [lines(k).point1;lines(k).point2];
plot(xy(:,1),xy(:,2),'LineWidth',1,'Color','green');
% plot beginnings and ends of lines
plot(xy(1,1),xy(1,2),'o','LineWidth',2,'Color','red');
plot(xy(2,1),xy(2,2),'o','LineWidth',2,'Color','blue');
% determine the endpoints of the longest line segment
len = norm(lines(k).point1 - lines(k).point2);
if( len > max_len )
max_len = len;
xy_long = xy;
end
end
关于我该怎么做的任何想法?任何帮助将不胜感激!
从 houghlines
开始,您只需将行的索引替换为白色(在本例中为 255)。您可能需要稍微调整一下填充,以减少一两个像素。
编辑:这是一个尝试确定填充的版本。
%% OCR
I = imread('CEBML.jpg');
BW = edge(I,'canny');
[H,T,R] = hough(BW);
P = houghpeaks(H,1,'threshold',ceil(0.3*max(H(:))));
x = T(P(:,2));
y = R(P(:,1));
% Find lines and plot them
lines = houghlines(BW,T,R,P,'FillGap',5,'MinLength',7);
subplot(2,1,1)
grid on
imshow(I)
title('Input')
hold on
px = 5; % Number of padding pixels to probe
white_threshold = 30; % White threshold
ln_length = .6; % 60 %
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
buf_y = xy(1,1):xy(2,1); % Assuming it's a straight line!
buf_x = [repmat(xy(1,2),1,xy(2,1) - xy(1,1)),xy(2,2)] + [-px:px]';
I_idx = sub2ind(size(I),buf_x, repmat(buf_y,size(buf_x,1),1));
% Consider lines that are below white threshold, and are longer than xx
% of the found line.
idx = sum(I(I_idx) <= white_threshold,2) >= ln_length * size(I_idx,2);
I(I_idx(idx,:)) = 255;
% Some visualisation
[ixx,jyy] = ind2sub(size(I),I_idx(idx,:));
plot(jyy,ixx,'.r');% Pixels set to white
plot(xy(:,1),xy(:,2),'-b','LineWidth',2); % Found lines
end
subplot(2,1,2)
grid on
imshow(I)
title('Output')
我正在做关于光学字符识别的论文。我的工作是从图像中正确分割文本字符。
问题是,这种语言中的每一行文本都有单词,其中字符通常由直线连接。这些线的粗细可能相同,也可能不同。
到目前为止,使用投影配置文件,我已经能够分割不与任何直线相连的字符。但是要分割由直线连接的字符,我必须删除这些线。我更喜欢使用霍夫变换来检测和删除那些线条(意思是在黑白图像中,如果线条中的像素是黑色的,则将其设为白色)。
查看包含文字的示例图片: Sample Image
This是使用投影剖面从上图中分割出的一条线。
和These是使用霍夫变换检测到的线。
霍夫变换代码。使用 This 图片进行测试。
I = imread('line0.jpg');
%I = rgb2gray(I);
BW = edge(I,'canny');
[H,T,R] = hough(BW);
imshow(H,[],'XData',T,'YData',R,'InitialMagnification','fit');
xlabel('\theta'),ylabel('\rho');
axis on, axis normal, hold on;
P = houghpeaks(H,1,'threshold',ceil(0.3*max(H(:))));
x = T(P(:,2));
y = R(P(:,1));
plot(x,y,'s','color','blue');
% Find lines and plot them
lines = houghlines(BW,T,R,P,'FillGap',5,'MinLength',7);
figure, imshow(I), hold on
grid on
max_len = 0;
for k = 1:length(lines)
xy = [lines(k).point1;lines(k).point2];
plot(xy(:,1),xy(:,2),'LineWidth',1,'Color','green');
% plot beginnings and ends of lines
plot(xy(1,1),xy(1,2),'o','LineWidth',2,'Color','red');
plot(xy(2,1),xy(2,2),'o','LineWidth',2,'Color','blue');
% determine the endpoints of the longest line segment
len = norm(lines(k).point1 - lines(k).point2);
if( len > max_len )
max_len = len;
xy_long = xy;
end
end
关于我该怎么做的任何想法?任何帮助将不胜感激!
从 houghlines
开始,您只需将行的索引替换为白色(在本例中为 255)。您可能需要稍微调整一下填充,以减少一两个像素。
编辑:这是一个尝试确定填充的版本。
%% OCR
I = imread('CEBML.jpg');
BW = edge(I,'canny');
[H,T,R] = hough(BW);
P = houghpeaks(H,1,'threshold',ceil(0.3*max(H(:))));
x = T(P(:,2));
y = R(P(:,1));
% Find lines and plot them
lines = houghlines(BW,T,R,P,'FillGap',5,'MinLength',7);
subplot(2,1,1)
grid on
imshow(I)
title('Input')
hold on
px = 5; % Number of padding pixels to probe
white_threshold = 30; % White threshold
ln_length = .6; % 60 %
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
buf_y = xy(1,1):xy(2,1); % Assuming it's a straight line!
buf_x = [repmat(xy(1,2),1,xy(2,1) - xy(1,1)),xy(2,2)] + [-px:px]';
I_idx = sub2ind(size(I),buf_x, repmat(buf_y,size(buf_x,1),1));
% Consider lines that are below white threshold, and are longer than xx
% of the found line.
idx = sum(I(I_idx) <= white_threshold,2) >= ln_length * size(I_idx,2);
I(I_idx(idx,:)) = 255;
% Some visualisation
[ixx,jyy] = ind2sub(size(I),I_idx(idx,:));
plot(jyy,ixx,'.r');% Pixels set to white
plot(xy(:,1),xy(:,2),'-b','LineWidth',2); % Found lines
end
subplot(2,1,2)
grid on
imshow(I)
title('Output')