Cython: Performance in Python of view_as_windows vs manual algorithm?
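My environment: OS: Ubuntu; language: Python + Cython.
I'm a bit stuck. I'm using view_as_windows to split an image, and it returns an array of all the patches created by the slicing. I also wrote my own algorithm that does almost the same thing, to get more control over the slicing. I have tested both and they produce the results I want. My problem now is that I need better performance, so I'm trying to cythonize things. I'm very new to Cython, so I haven't actually made any changes yet.
view_as_windows time per image: 0.0033s
patches_by_col time per image: 0.057s
Question:
Given these run times, would I get better performance by cythonizing the manual algorithm, or should I keep using view_as_windows?
I ask because I don't think I can cythonize view_as_windows, since it is called from numpy. I'm testing with variable stride disabled (strideDivisor == 0 and imgRegion == 0). The image size is 1200 x 800.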
GetPatchesAndCoordByRow (manual code)
Parameters:
#Patch Image Settings: Should be 3x2 ratio for width to height
WIDTH = 60
HEIGHT = 40
CHANNELS = 1
ITERATIONS = 7
MULTIPLIER = 1.31
#Stride will be how big of a step each crop takes.
#If you don't want crops to overlap, set the stride equal to the patch width.
STRIDE = 6
# STRIDE_IMREG_DIV decreases normal stride inside an image region
#Set amount by which to divide stride.
#Ex: 2 would reduce stride by 50%, and generate 200% data
#Ex contd: So it would output 40K patches instead of 20K
#strideDivisor = 1.5
# IMG_REGION determines what % of image region will produce additional patches
#Region of image to focus by decreasing stride. Ex: 0.5 would increase patches in inner 50% of image
#imgRegion = 0.5
# Set STRIDE_IMREG_DIV and IMG_REGION = 0 to disable functionality.
STRIDE_IMREG_DIV = 0
IMG_REGION = 0
Source code:
def setVarStride(x2, y2, maxX, maxY, stride, div, imgReg, var):
    imgFocReg1 = imgReg/2
    imgFocReg2 = 1 - imgFocReg1
    if (var == 'x'):
        if ((x2 >= maxX*imgFocReg1) and (x2 <= maxX*imgFocReg2) and (y2 >= maxY*imgFocReg1) and (y2 <= maxY*imgFocReg2)):
            vStride = stride/div
        else:
            vStride = stride
    elif (var == 'y'):
        if ((y2 >= maxY*imgFocReg1) and (y2 <= maxY*imgFocReg2)):
            vStride = stride/div
        else:
            vStride = stride
    return vStride

def GetPatchesAndCoordByRow(image, patchHeight, patchWidth, stride, strideDivisor, imgRegion):
    x1 = 0
    y1 = 0
    x2 = patchWidth
    y2 = patchHeight
    croppedImageList = []
    maxX, maxY = image.size
    #Set variable stride to collect more data in a region of the image
    varStride = stride
    useVaraibleStride = True
    if (strideDivisor == 0 and imgRegion == 0):
        useVaraibleStride = False
    else:
        imgConcentration = (1 - imgRegion)*100
        print("Variable Stride ENABLED: Create more patches inside {0}% of the image.".format(imgConcentration))
    while y2 <= (maxY):
        while x2 <= (maxX):
            croppedImage = image.crop((x1,y1,x2,y2))
            croppedImageList.append((croppedImage,(x1, y1, x2, y2)))
            #Get 2x more patches in the center of the image
            if (useVaraibleStride):
                varStride = setVarStride(x2, y2, maxX, maxY, stride, strideDivisor, imgRegion, 'x')
            #Rows
            x1 += varStride
            x2 += varStride
            #--DEBUG
            #iX += 1
            #print("Row_{4} -> x1: {0}, y1: {1}, x2: {2}, y2: {3}".format(x1, y1, x2, y2,iX))
        #Get 2x more patches in the center of the image
        if (useVaraibleStride):
            varStride = setVarStride(x2, y2, maxX, maxY, stride, strideDivisor, imgRegion, 'y')
        #Columns
        x1 = 0
        x2 = patchWidth
        y1 += varStride
        y2 += varStride
        #--DEBUG
        #iY += 1
        #print(" Column_{4} -> x1: {0}, y1: {1}, x2: {2}, y2: {3}".format(x1, y1, x2, y2, iY))
    #Get patches at edge of image
    x1 = 0
    x2 = patchWidth
    y1 = maxY - patchHeight
    y2 = maxY
    #Bottom edge patches
    while x2 <= (maxX):
        #--DEBUG
        #iX += 1
        #print("Row_{4} -> x1: {0}, y1: {1}, x2: {2}, y2: {3}".format(x1, y1, x2, y2,iX))
        #--DEBUG
        croppedImage = image.crop((x1,y1,x2,y2))
        croppedImageList.append((croppedImage,(x1, y1, x2, y2)))
        #Rows
        x1 += stride
        x2 += stride
    #Right edge patches
    x1 = maxX - patchWidth
    x2 = maxX
    y1 = 0
    y2 = patchHeight
    while y2 <= (maxY):
        #--DEBUG
        #iY += 1
        #print(" Column_{4} -> x1: {0}, y1: {1}, x2: {2}, y2: {3}".format(x1, y1, x2, y2, iY))
        #--DEBUG
        croppedImage = image.crop((x1,y1,x2,y2))
        croppedImageList.append((croppedImage,(x1, y1, x2, y2)))
        #Columns
        y1 += stride
        y2 += stride
    #--DEBUG
    print("GetPatchesAndCoordByRow (Count={0}, W={1}, H={2}, Stride={3})".format(len(croppedImageList), int(patchWidth), int(patchHeight), int(stride)))
    return croppedImageList

view_as_windows code
import numpy
from skimage.util import view_as_windows

def CreatePatches(image, patchHeight, patchWidth, stride = 1):
    imageArray = numpy.asarray(image)
    patches = view_as_windows(imageArray, (patchHeight, patchWidth), stride)
    print("Raw Patches initial shape: {0}".format(patches.shape))
    return patches

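The manual algorithm returns an (x1, y1, x2, y2) box with every crop, while the windowed array carries no coordinates. A minimal sketch of how the boxes could be recovered arithmetically, assuming a fixed step in both directions and the 1200 x 800 image, 60 x 40 patch and stride 6 from the question. The helper name window_coords is hypothetical, and the sketch does not reproduce the extra bottom-edge and right-edge patches of the manual code:

```python
import numpy as np

def window_coords(image_height, image_width, patch_height, patch_width, step):
    """Return (x1, y1, x2, y2) boxes for a regular grid of windows.

    The result has shape (n_rows, n_cols, 4), so boxes[i, j] lines up with
    entry [i, j] of a windowed array built with the same patch size and step.
    """
    # Top-left corners along each axis; the last window must still fit.
    ys = np.arange(0, image_height - patch_height + 1, step)
    xs = np.arange(0, image_width - patch_width + 1, step)
    y1, x1 = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([x1, y1, x1 + patch_width, y1 + patch_height], axis=-1)

# Values taken from the question: 1200 x 800 image, 60 x 40 patch, stride 6.
boxes = window_coords(800, 1200, 40, 60, 6)
print(boxes.shape)
print(boxes[0, 0], boxes[-1, -1])
```

Note that numpy arrays are indexed as (row, column), that is (height, width), whereas image.size in the manual code is (width, height).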
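I don't think you can do much better than view_as_windows, because it is already very efficient as long as the input array is contiguous. I doubt that even cythonizing it would make much of a difference. I looked at its implementation and was actually somewhat impressed: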
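A numpy array consists of an underlying data array (e.g. a char *) and an array of "strides", one per dimension, which says how far to move along the underlying data array for each step along that dimension. The implementation of view_as_windows takes advantage of this by creating a new array that shares the same data array as its input and simply inserting new "strides" to add dimensions that can be used to select patches. This means that it does not return "an array of all the patches", as you put it; it returns a single array whose leading dimensions act as indices into an array of patches.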
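As a result, view_as_windows does not need to copy any data from the image to create the patches, nor does it need to create an additional ndarray object for each patch. The only time it needs to copy data is when its input array is not contiguous (for example, when it is a slice of a larger array). Even with Cython, I don't see how you could do better than that.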
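To make the stride trick concrete, here is a minimal sketch of a window view built directly with numpy's as_strided. It is an illustration of the idea only, not the scikit-image source; the function name windows_view and the test image are assumptions made for the example:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def windows_view(arr, patch_height, patch_width, step=1):
    """Build a 4-D window view over a 2-D array without copying any data."""
    rows, cols = arr.shape
    row_stride, col_stride = arr.strides  # bytes per step along each axis
    n_rows = (rows - patch_height) // step + 1
    n_cols = (cols - patch_width) // step + 1
    return as_strided(
        arr,
        shape=(n_rows, n_cols, patch_height, patch_width),
        # The first two strides jump from window to window;
        # the last two walk inside a single window.
        strides=(row_stride * step, col_stride * step, row_stride, col_stride),
        writeable=False,  # windows overlap, so writing through the view is unsafe
    )

# Hypothetical stand-in for an 800-row by 1200-column single-channel image.
image = np.arange(800 * 1200, dtype=np.int32).reshape(800, 1200)
patches = windows_view(image, 40, 60, step=6)

print(patches.shape)
print(np.shares_memory(image, patches))
print(np.array_equal(patches[3, 5], image[18:58, 30:90]))
```

Only the shape and strides metadata are new here, so building the view costs about the same regardless of how many windows it describes. The data is copied only if the patches are later materialized, for example with numpy.ascontiguousarray.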
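In your implementation, even assuming that image.crop is able to share data with the image, you are still creating an array of what looks like 1199x799 distinct image objects.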
Have you confirmed that view_as_windows is where your algorithm spends most of its time?
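One way to check this is to run the whole pipeline under the standard-library profiler and see which calls dominate. A minimal sketch, where profile_call is a hypothetical helper and pipeline is a placeholder for your own per-image function:

```python
import cProfile
import io
import pstats

def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()

def pipeline(n):
    # Placeholder workload; replace with the real per-image processing.
    return sum(i * i for i in range(n))

result, report = profile_call(pipeline, 100_000)
print(report)
```

If the patch extraction barely shows up in the report, the time is going elsewhere and cythonizing the patching code is unlikely to help.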