如何证明高斯金字塔的脉冲响应是尺度不变的?
How to demonstrate that the impulse response of the Gaussian Pyramid is Scale Invariant?
我用中心有一个狄拉克脉冲的 512x512
图像构建了一个高斯金字塔 (256,256),然后尝试按照以下过程来证明这个金字塔是尺度不变的,并且它具有每个级别的脉冲响应相同,但结果似乎不太正确!
你能告诉我怎么做吗?
编辑:
我编辑了代码以修复一些错误,感谢@CrisLuengo 的注释。
代码:
import numpy as np
import matplotlib.pyplot as plt
import cv2
import skimage.exposure as exposure
from math import sqrt, ceil
#=================
# Resize Function
#=================
def _resize(image, downscale=2, step=0.5, minSize=(7, 7)):
if(image.shape > minSize ):
# newSize = (image.shape[0]// downscale, image.shape[1]//downscale)
# newImage = cv2.resize(image, dsize=newSize, fx=step, fy=step)
newImage = cv2.resize(image, None, fx=step, fy=step)
return newImage
else:
return 0
#--------------------------------------------------------------
#===========================
# Gaussian Pyramid Function
#===========================
def pyramid(image, sigma_0=1):
'''
Function to create a Gaussian pyramid from an image for given standard deviation sigma_0
Parameters:
-----------
@param: image: nd-array.
The original image.
@param: sigma_0: float.
standard deviation of the Gaussian distribution.
returns:
List of images with different scales, the pyramid
'''
# Resize All input images into a standard size
image = cv2.resize(image,(512,512))
# level 0
if ceil(6*sigma_0)%2 ==0 :
Gimage = cv2.GaussianBlur(image, (ceil(6*sigma_0)+1, ceil(6*sigma_0)+1), sigmaX=sigma_0, sigmaY=sigma_0)
else:
Gimage = cv2.GaussianBlur(image, (ceil(6*sigma_0)+2, ceil(6*sigma_0)+2), sigmaX=sigma_0, sigmaY=sigma_0)
# sigma_k
sigma_k = 4*sigma_0
# sigma_k = sqrt(2)*sigma_0
# Pyramid as list
GaussPyr = [Gimage]
# Loop of other levels of the pyramid
for k in range(1,6):
if ceil(6*sigma_k)%2 ==0 :
# smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+1, ceil(6*sigma_k)+1), sigmaX=sigma_k, sigmaY=sigma_0)
smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+1, ceil(6*sigma_k)+1), sigmaX=sigma_k, sigmaY=sigma_k)
else:
# smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+2, ceil(6*sigma_k)+2), sigmaX=sigma_k, sigmaY=sigma_0)
smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+2, ceil(6*sigma_k)+2), sigmaX=sigma_k, sigmaY=sigma_k)
# Downscaled Image
resized = _resize(smoothed ) # ,step=0.25*sigma_k
GaussPyr.append(resized)
return GaussPyr
#====================
# Impulse Response
#====================
# Zeros 512x512 Black Image
delta = np.zeros((512, 512), dtype=np.float32)
# Dirac
delta[255,255] = 255
# sigmas
sigma1 = 1
sigma2 = sqrt(2)
# Pyramids
deltaPyramid1 = pyramid(delta, sigma_0=sigma1)
deltaPyramid2 = pyramid(delta, sigma_0=sigma2)
# Impulse Response for each level
ImpResp1 = np.zeros((len(deltaPyramid1), 13),dtype=float)
ImpResp2 = np.zeros((len(deltaPyramid2), 13),dtype=float)
# sigma = 1
for idx, level in enumerate(deltaPyramid1):
# # 1
# level = cv2.resize(level, (512, 512))# , interpolation=cv2.INTER_AREA
# ImpResp1[idx,:] = exposure.rescale_intensity(level[255, 249:262], in_range='image', out_range=(0,255)).astype(np.uint8)
# ImpResp1[idx,:] = level[255, 249:262]
# # 2
centery = level.shape[0]//2
centerx = level.shape[1]//2
ImpResp1[idx,:] = exposure.rescale_intensity(level[centery, (centerx-7):(centerx+6)], out_range=(0,255), in_range='image').astype(np.uint8)
# ImpResp1[idx,:] = level[centery, (centerx-7):(centerx+6)]
# sigma = sqrt(2)
for idx, level in enumerate(deltaPyramid2):
# # 1
# level = cv2.resize(level, (512, 512))# , interpolation=cv2.INTER_AREA
# ImpResp2[idx,:] = exposure.rescale_intensity(level[255, 249:262], in_range='image', out_range=(0,255)).astype(np.uint8)
# ImpResp2[idx,:] = level[255, 249:262]
# # 2
centery = level.shape[0]//2
centerx = level.shape[1]//2
ImpResp2[idx,:] = exposure.rescale_intensity(level[centery, (centerx-7):(centerx+6)], out_range=(0,255), in_range='image').astype(np.uint8)
# ImpResp2[idx,:] = level[centery, (centerx-7):(centerx+6)]
#====================
# Visualize Results
#====================
labels = []
for c in range(13):
label = 'C{}'.format(c+1)
labels.append(label)
x = np.arange(len(labels)) # the label locations
width = 0.1 # the width of the bars
fig, ax = plt.subplots()
rects1 = []
for k in range(ImpResp1.shape[0]):
rects1.append(ax.bar(x - 2*k*width, ImpResp1[k], width, label='K{}'.format(k)))
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('values')
ax.set_title('sigma0=1')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()
fig.tight_layout()
fig2, ax2 = plt.subplots()
rects2 = []
for k in range(ImpResp1.shape[0]):
rects2.append(ax2.bar(x + 2*k*width, ImpResp2[k], width, label='K{}'.format(k)))
# Add some text for labels, title and custom x-axis tick labels, etc.
ax2.set_ylabel('values')
ax2.set_title('sigma0=sqrt(2)')
ax2.set_xticks(x)
ax2.set_xticklabels(labels)
ax2.legend()
fig2.tight_layout()
plt.show()
首先,让我们简化到一个足够简单的情况,可以看到高斯的缩放属性。将增量图像与高斯进行卷积会产生该高斯。高斯 B 的大小是高斯 A 的两倍,然后在空间上缩放一半,与 A 相同(当然,直到强度缩放,B 在 2D 中是 A 的 1/4)。
delta = <all zeros except one pixel in the middle>
A = GaussianBlur(delta, 1)
B = GaussianBlur(delta, 2)
B = resize(B, 1/2)
A == B * 2**2
C = GaussianBlur(delta, sigma=7.489)
C = resize(C, 1/7.489)
A == C * 7.489**2
现在,如果我们链接模糊操作,我们将获得更强的模糊效果。输出 sigma 的平方等于应用的 sigma 的平方和:
A = GaussianBlur(delta, 1)
B = GaussianBlur(delta, 2)
C = GaussianBlur(A, sqrt(3))
B == C
即1**2 + sqrt(3)**2 = 2**2
.
因此,在金字塔的每一步,我们都需要计算已经应用了多少模糊,并应用正确的量以达到必要的模糊程度。每次我们模糊时,我们都会将模糊增加给定的量,每次我们重新缩放时我们都会将模糊减少给定的量。
如果sigma0
是初始平滑,sigma1
是降尺度前应用的平滑,降尺度系数k>1,那么这个关系:
sqrt(sigma0**2 + sigma1**2) / k == sigma0
将确保缩小后的增量图像与原始平滑增量图像相同(直至强度缩放)。我们得到:
sigma1 = sqrt((sigma0 * k)**2 - sigma0**2)
(如果我做对了,在我的 phone 屏幕上)。
由于我们返回到与原始图像相同的图像,因此后续的金字塔级别将使用这些相同的值。
我在您的代码中注意到的另一个问题是,您在开始处理之前将增量图像重新缩放为“标准尺寸”。不要这样做,delta image 将不再是 delta image,上面的关系将不再成立。输入必须恰好有一个像素设置为 1,其余为 0。
我用中心有一个狄拉克脉冲的 512x512
图像构建了一个高斯金字塔 (256,256),然后尝试按照以下过程来证明这个金字塔是尺度不变的,并且它具有每个级别的脉冲响应相同,但结果似乎不太正确!
编辑:
我编辑了代码以修复一些错误,感谢@CrisLuengo 的注释。
代码:
import numpy as np
import matplotlib.pyplot as plt
import cv2
import skimage.exposure as exposure
from math import sqrt, ceil
#=================
# Resize Function
#=================
def _resize(image, downscale=2, step=0.5, minSize=(7, 7)):
if(image.shape > minSize ):
# newSize = (image.shape[0]// downscale, image.shape[1]//downscale)
# newImage = cv2.resize(image, dsize=newSize, fx=step, fy=step)
newImage = cv2.resize(image, None, fx=step, fy=step)
return newImage
else:
return 0
#--------------------------------------------------------------
#===========================
# Gaussian Pyramid Function
#===========================
def pyramid(image, sigma_0=1):
'''
Function to create a Gaussian pyramid from an image for given standard deviation sigma_0
Parameters:
-----------
@param: image: nd-array.
The original image.
@param: sigma_0: float.
standard deviation of the Gaussian distribution.
returns:
List of images with different scales, the pyramid
'''
# Resize All input images into a standard size
image = cv2.resize(image,(512,512))
# level 0
if ceil(6*sigma_0)%2 ==0 :
Gimage = cv2.GaussianBlur(image, (ceil(6*sigma_0)+1, ceil(6*sigma_0)+1), sigmaX=sigma_0, sigmaY=sigma_0)
else:
Gimage = cv2.GaussianBlur(image, (ceil(6*sigma_0)+2, ceil(6*sigma_0)+2), sigmaX=sigma_0, sigmaY=sigma_0)
# sigma_k
sigma_k = 4*sigma_0
# sigma_k = sqrt(2)*sigma_0
# Pyramid as list
GaussPyr = [Gimage]
# Loop of other levels of the pyramid
for k in range(1,6):
if ceil(6*sigma_k)%2 ==0 :
# smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+1, ceil(6*sigma_k)+1), sigmaX=sigma_k, sigmaY=sigma_0)
smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+1, ceil(6*sigma_k)+1), sigmaX=sigma_k, sigmaY=sigma_k)
else:
# smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+2, ceil(6*sigma_k)+2), sigmaX=sigma_k, sigmaY=sigma_0)
smoothed = cv2.GaussianBlur(GaussPyr[k-1], (ceil(6*sigma_k)+2, ceil(6*sigma_k)+2), sigmaX=sigma_k, sigmaY=sigma_k)
# Downscaled Image
resized = _resize(smoothed ) # ,step=0.25*sigma_k
GaussPyr.append(resized)
return GaussPyr
#====================
# Impulse Response
#====================
# Zeros 512x512 Black Image
delta = np.zeros((512, 512), dtype=np.float32)
# Dirac
delta[255,255] = 255
# sigmas
sigma1 = 1
sigma2 = sqrt(2)
# Pyramids
deltaPyramid1 = pyramid(delta, sigma_0=sigma1)
deltaPyramid2 = pyramid(delta, sigma_0=sigma2)
# Impulse Response for each level
ImpResp1 = np.zeros((len(deltaPyramid1), 13),dtype=float)
ImpResp2 = np.zeros((len(deltaPyramid2), 13),dtype=float)
# sigma = 1
for idx, level in enumerate(deltaPyramid1):
# # 1
# level = cv2.resize(level, (512, 512))# , interpolation=cv2.INTER_AREA
# ImpResp1[idx,:] = exposure.rescale_intensity(level[255, 249:262], in_range='image', out_range=(0,255)).astype(np.uint8)
# ImpResp1[idx,:] = level[255, 249:262]
# # 2
centery = level.shape[0]//2
centerx = level.shape[1]//2
ImpResp1[idx,:] = exposure.rescale_intensity(level[centery, (centerx-7):(centerx+6)], out_range=(0,255), in_range='image').astype(np.uint8)
# ImpResp1[idx,:] = level[centery, (centerx-7):(centerx+6)]
# sigma = sqrt(2)
for idx, level in enumerate(deltaPyramid2):
# # 1
# level = cv2.resize(level, (512, 512))# , interpolation=cv2.INTER_AREA
# ImpResp2[idx,:] = exposure.rescale_intensity(level[255, 249:262], in_range='image', out_range=(0,255)).astype(np.uint8)
# ImpResp2[idx,:] = level[255, 249:262]
# # 2
centery = level.shape[0]//2
centerx = level.shape[1]//2
ImpResp2[idx,:] = exposure.rescale_intensity(level[centery, (centerx-7):(centerx+6)], out_range=(0,255), in_range='image').astype(np.uint8)
# ImpResp2[idx,:] = level[centery, (centerx-7):(centerx+6)]
#====================
# Visualize Results
#====================
labels = []
for c in range(13):
label = 'C{}'.format(c+1)
labels.append(label)
x = np.arange(len(labels)) # the label locations
width = 0.1 # the width of the bars
fig, ax = plt.subplots()
rects1 = []
for k in range(ImpResp1.shape[0]):
rects1.append(ax.bar(x - 2*k*width, ImpResp1[k], width, label='K{}'.format(k)))
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('values')
ax.set_title('sigma0=1')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()
fig.tight_layout()
fig2, ax2 = plt.subplots()
rects2 = []
for k in range(ImpResp1.shape[0]):
rects2.append(ax2.bar(x + 2*k*width, ImpResp2[k], width, label='K{}'.format(k)))
# Add some text for labels, title and custom x-axis tick labels, etc.
ax2.set_ylabel('values')
ax2.set_title('sigma0=sqrt(2)')
ax2.set_xticks(x)
ax2.set_xticklabels(labels)
ax2.legend()
fig2.tight_layout()
plt.show()
首先,让我们简化到一个足够简单的情况,可以看到高斯的缩放属性。将增量图像与高斯进行卷积会产生该高斯。高斯 B 的大小是高斯 A 的两倍,然后在空间上缩放一半,与 A 相同(当然,直到强度缩放,B 在 2D 中是 A 的 1/4)。
delta = <all zeros except one pixel in the middle>
A = GaussianBlur(delta, 1)
B = GaussianBlur(delta, 2)
B = resize(B, 1/2)
A == B * 2**2
C = GaussianBlur(delta, sigma=7.489)
C = resize(C, 1/7.489)
A == C * 7.489**2
现在,如果我们链接模糊操作,我们将获得更强的模糊效果。输出 sigma 的平方等于应用的 sigma 的平方和:
A = GaussianBlur(delta, 1)
B = GaussianBlur(delta, 2)
C = GaussianBlur(A, sqrt(3))
B == C
即1**2 + sqrt(3)**2 = 2**2
.
因此,在金字塔的每一步,我们都需要计算已经应用了多少模糊,并应用正确的量以达到必要的模糊程度。每次我们模糊时,我们都会将模糊增加给定的量,每次我们重新缩放时我们都会将模糊减少给定的量。
如果sigma0
是初始平滑,sigma1
是降尺度前应用的平滑,降尺度系数k>1,那么这个关系:
sqrt(sigma0**2 + sigma1**2) / k == sigma0
将确保缩小后的增量图像与原始平滑增量图像相同(直至强度缩放)。我们得到:
sigma1 = sqrt((sigma0 * k)**2 - sigma0**2)
(如果我做对了,在我的 phone 屏幕上)。
由于我们返回到与原始图像相同的图像,因此后续的金字塔级别将使用这些相同的值。
我在您的代码中注意到的另一个问题是,您在开始处理之前将增量图像重新缩放为“标准尺寸”。不要这样做,delta image 将不再是 delta image,上面的关系将不再成立。输入必须恰好有一个像素设置为 1,其余为 0。