What changes should be made in the algorithm/code to get the expected shapes of vehicles?

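I am working on an image processing project that is based on the importance of phase-only reconstruction. For more information, you can read the answer given by geometrikal at https://dsp.stackexchange.com/questions/16462/how-moving-part-pixel-intensity-values-of-video-frames-becomes-dominant-compared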

I want to detect moving objects in a video of road traffic (to download the 1.47 MB video: (step 1) click the play button, (step 2) right-click on the video, (step 3) choose the "Save as" option).

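The algorithm for it is:

Proposed approach

Require: Input image sequence I(x, y, n) extracted from the video (where x and y are the image dimensions and n denotes the frame number in the video).

Result: Segmentation mask of the moving objects for each frame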

  1. For each frame in the input video, perform step 2 and append the result to the array I(x, y, n).

  2. Smooth the current frame using a 2D Gaussian filter.

  3. Perform a 3D FFT on the whole sequence I(x, y, n) using (Eq. 4.1).

  4. Calculate the phase spectrum using the real and imaginary parts of the 3D DFT.

  5. Calculate the reconstructed sequence Î(x, y, n) using (Eq. 4.2).

  6. For each frame in the input video, perform steps 7 to 10 to get its segmentation mask, and append the result of step 10 to the segmentation mask array BW(x, y, n).

  7. Smooth the reconstructed frame of Î(x, y, n) using the averaging filter.

  8. Compute the mean value of the current frame.

  9. Convert the current frame into a binary image using the mean value as the threshold.

  10. Perform morphological processing, i.e., filling and closing, to obtain the segmentation mask of the moving objects for the current frame.

  11. End algorithm.

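With the above algorithm I can find all the moving objects in the video.

But the problem is that the segmentation masks I get for the vehicles do not have the proper shapes I expect.

So can anybody help me get the expected shapes?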

  1. What changes should I make in the algorithm?

  2. What changes should I make in the MATLAB code?
tic
clc;
clear all;
close all;

%read video file
video = VideoReader('D:\dvd\Matlab code\test videos.mp4');

T = video.NumberOfFrames;            %number of frames

frameHeight = video.Height;          %frame height

frameWidth = video.Width;            %frame width

get(video);                          %query the properties of the VideoReader object


i=1;

for t=300:15:550  %select every 15th frame from frame 300 to frame 550
    frame_x(:,:,:,i)= read(video, t);
    frame_y=frame_x(:,:,:,i);

    %figure,
    %imshow(frame_y),title(['test frames :' num2str(i)]);
    frame_z=rgb2gray(frame_y);                 %convert each colour frame to grayscale

    frame_m(:,:,:,i)=frame_y; %store colour frames in the frame_m array

    %Perform Gaussian filtering
    h1=(1/8)*(1/8)*[1 3 3 1]'*[1 3 3 1];     %4x4 binomial (approximately Gaussian) kernel, sums to 1
    smoothed=conv2(frame_z,h1,'same');       %named "smoothed" so the built-in convn is not shadowed

    g1=uint8(smoothed);

    Filtered_Image_Array(:,:,i)=g1; %store filtered images in an array
    i=i+1;
end

%Apply 3-D Fourier Transform on video sequences
f_transform=fftn(Filtered_Image_Array);

%Compute phase spectrum array from f_transform
phase_spectrum_array =exp(1j*angle(f_transform));

%Apply 3-D Inverse Fourier Transform on phase spectrum array and
%reconstruct the frames
reconstructed_frame_array=(ifftn(phase_spectrum_array));


k=i;

i=1;
for t=1:k-1

    %Smooth the reconstructed frame of Î(x, y, n) using the averaging filter.
    Reconstructed_frame_magnitude=abs(reconstructed_frame_array(:,:,t));
    H = fspecial('disk',4);
    circular_avg(:,:,t) = imfilter(Reconstructed_frame_magnitude,H);


    %Convert the current frame into a binary image using 1.6 times the mean value as the threshold
    mean_value=mean2(circular_avg(:,:,t));
    binary_frame = circular_avg(:,:,t) > 1.6*mean_value; %direct comparison; im2bw is no longer recommended


    %Perform morphological operations
    se = strel('square',3);
    morphological_closing = imclose(binary_frame,se);
    morphological_closing=imclearborder(morphological_closing); %remove objects that touch the frame borders


    %Superimpose the segmentation mask on its respective frame to obtain the
    %moving objects
    moving_object_frame = frame_m(:,:,:,i);
    moving_object_frame(morphological_closing) = 255; %2-D mask on a 3-D array: only the first (red) channel is set
    figure,
    imshow(moving_object_frame,[]), title(['Moving objects in Frame :' num2str(i)]);

    i=i+1;
end
toc

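I don't know the details of the algorithm (by the way, you could improve readability by using more meaningful names than f1, f2, f7, mean1, mean2, etc.), but your problem seems to be inherent to the technique used.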

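By using the phase of the FFT, you process each pixel independently, without any awareness of contours. What you can do, though, is tweak the threshold a bit (here it is fixed to the mean value) and see how the result responds.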
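To get a feel for how the mask reacts to the threshold, a small experiment along the following lines may help. It is only a sketch on synthetic data: the 64x64x16 test sequence, the moving square, the 5x5 box filter (used instead of the disk filter so that no toolbox is needed) and the list of factors are all assumptions made for illustration, not values taken from your video.

```matlab
% Sketch: how the threshold factor changes the size of the mask.
% Everything about the test data below is an assumption for illustration.
H = 64; W = 64; N = 16;

% Static textured background, identical in every frame
bg  = 100 + 20*sin((1:H)'/5) * cos((1:W)/7);
seq = repmat(bg, [1 1 N]);

% Bright 8x8 square that moves 3 pixels to the right per frame
for n = 1:N
    c = 4 + 3*n;
    seq(20:27, c:c+7, n) = 200;
end

% Phase-only reconstruction, same operations as in the question's code
F     = fftn(seq);
recon = abs(ifftn(exp(1j*angle(F))));

% Smooth one reconstructed frame with a 5x5 box average
frame = conv2(recon(:,:,8), ones(5)/25, 'same');

% Threshold at several multiples of the frame mean and count mask pixels
factors   = [1.0 1.6 2.5 4.0];          % hypothetical values to compare
mask_area = zeros(size(factors));
for k = 1:numel(factors)
    mask         = frame > factors(k)*mean(frame(:));
    mask_area(k) = nnz(mask);           % number of foreground pixels
    fprintf('factor %.1f -> %d mask pixels\n', factors(k), mask_area(k));
end
```

The mask can only shrink as the factor grows, so a low factor tends to give fuller but noisier blobs and a high factor cleaner but more fragmented ones. Displaying `mask` for each factor on a few real frames should show which trade-off suits your video.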

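Another option is to post-process the current result by trying to identify the expected shapes in the image (see the expectation-maximization algorithm).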
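A much simpler form of post-processing, which is not expectation-maximization and not a substitute for it, is a stronger morphological clean-up of each mask. Your algorithm lists filling as well as closing in step 10, but the posted code only closes with a 3x3 square. The sketch below assumes a hypothetical test mask, a 5x5 structuring element and a 50-pixel minimum blob size; all three would need tuning on real frames. It uses the Image Processing Toolbox, as your code already does.

```matlab
% Sketch: close gaps, fill holes, then drop small blobs.
% The test mask and all sizes below are assumptions for illustration.
mask = false(60, 80);
mask(20:40, 20:60)  = true;    % vehicle-like blob (21 x 41 pixels)
mask(28:32, 35:45)  = false;   % hole inside the blob
mask(20:40, 39:40)  = false;   % 2-pixel crack that splits the blob in two
mask(5, 5)          = true;    % isolated noise pixel
mask(50:51, 70:71)  = true;    % small 2x2 noise blob

se      = strel('square', 5);          % larger than the 3x3 used in the question
cleaned = imclose(mask, se);           % bridges the narrow crack
cleaned = imfill(cleaned, 'holes');    % fills the hole, which is now enclosed
cleaned = bwareaopen(cleaned, 50);     % removes blobs smaller than 50 pixels

cc = bwconncomp(cleaned);
fprintf('%d object(s), %d pixels in total\n', cc.NumObjects, nnz(cleaned));
```

The order matters here: before the closing, the hole is connected to the background through the crack, so `imfill` on its own would leave it open. A structuring element that is too large will start merging neighbouring vehicles, so its size is a trade-off rather than a fixed answer.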

What are your constraints?