Saving a 3D 32-bit float array to a 48-bit integer PNG in Python to match the KITTI ground truth format
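KITTI has an optical flow benchmark. It requires flow estimates to be submitted as 48-bit PNG files, matching the format of its ground truth files.
The ground truth PNG images are available to download here
KITTI provides a Matlab DevKit for comparing estimates against the ground truth.
I want to output the flow from my network as a 48-bit integer PNG file so that my flow estimates can be compared with other KITTI benchmark flow estimates.
The scaled flow file from the network, saved as a numpy array, is downloadable from here
However, I have not been able to convert the float32 3D flow array into a 3-channel 48-bit file (16 bits per channel) in Python, either because the image libraries do not seem to support this or because something is wrong with my code. Can anyone help?
I have tried many different libraries and read many posts.
Unfortunately, Scipy outputs a PNG with only 24 bits.
The output flow estimate PNG generated using scipy is available here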
# Numpy Flow to 48bit PNG with 16bits per channel
import scipy as sp
from scipy import misc
import numpy as np
import png
import imageio
import cv2
from PIL import Image
from matplotlib import image

"""From Kitti DevKit:-
Optical flow maps are saved as 3-channel uint16 PNG images: The first channel
contains the u-component, the second channel the v-component and the third
channel denotes if the pixel is valid or not (1 if true, 0 otherwise). To convert
the u-/v-flow into floating point values, convert the value to float,
subtract 2^15 and divide the result by 64.0:"""

Scaled_Flow = np.load('Scaled_Flow.npy')  # This is a 32bit float
# This is the very first Kitti Test Flow Output from image_2 testing folder
# passed through DVF
# The network that produced this flow is only trained to 51 steps, so it
# won't provide an accurate correspondence
# But the Estimated Flow PNG should look green
ones = np.float32(np.ones((2,375,1242,1)))  # Kitti devkit readme says that third channel is 1 if flow is valid for that pixel
# 2 for batch size, 375 for height, 1242 for width, 1 for this extra layer of ones.
with_ones = np.concatenate((Scaled_Flow, ones), axis=3)
im = sp.misc.toimage(with_ones[-1,:,:,:], cmin=-1.0, cmax=1.0)  # saves image object
im.save("Scipy_24bit.png", dtype="uint48")  # Outputs 24bit only.
Flow = np.int16(with_ones)  # An attempt at converting the format from float 32 to 16 bit integers
f512 = Flow * 512  # Kitti instructs that the flows are scaled by 512.
x = np.array(Scaled_Flow)
x.astype(np.uint16)  # another attempt at converting it to unsigned 16 bit integers

try:  # try PyPNG
    with open('PyPNGuint48bit.png', 'wb') as f:
        writer = png.Writer(width=375, height=1242, bitdepth=16)
        # Convert z to the Python list of lists expected by
        # the png writer.
        #z2list = x.reshape(-1, x.shape[1]*x.shape[2]).tolist()
        writer.write(f, x)
except Exception:
    print("png lib approach didn't work, it might be to do with the sizing")

try:  # try imageio
    imageio.imwrite('imageio_Flow_48bit.png', x, format='PNG-FI')
except Exception:
    print("imageio approach didn't work, it probably couldn't handle the datatype")

try:  # try OpenCV
    cv2.imwrite('OpenCVFlow_48bit_.png', x)
except Exception:
    print("OpenCV approach didn't work, it probably couldn't handle the datatype")

try:  # try PIL
    im = Image.fromarray(x)
    im.save("PILLOW_Flow_48bit.png", format="PNG")
except Exception:
    print("PILLOW approach didn't work, it probably couldn't handle the datatype")

try:  # try Matplotlib
    image.imsave('MatplotLib_Flow_48bit.png', x)
except Exception:
    print("Matplotlib approach didn't work, ValueError: object too deep for desired array")
I would like to get a 48-bit PNG file like the KITTI ground truth, which looks green. Currently Scipy outputs a blue 24-bit PNG file that looks very white.
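Here is my understanding of what you want to do:
- Load the data from Scaled_Flow.npy. This is a 32-bit floating point numpy array with shape (2, 375, 1242, 2).
- Convert Scaled_Flow[1] (an array with shape (375, 1242, 2)) to 16-bit unsigned integers by:
  - multiplying by 64,
  - adding 2**15, and
  - casting the values to np.uint16.
  This is the inverse of the description that you quoted: "To convert the u-/v-flow into floating point values, convert the value to float, subtract 2^15 and divide the result by 64.0".
- Increase the length of the third dimension from 2 to 3 by concatenating an array of all 1s.
- Save the result to a PNG file.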
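The conversion steps above can be sketched with NumPy alone, together with the inverse mapping from the DevKit description, to check that the round trip recovers the flow. The function names `encode_kitti_flow` and `decode_kitti_flow` are hypothetical, and the rounding and clipping are my own additions for safety rather than something the DevKit text prescribes. The random array stands in for one element of Scaled_Flow.

```python
import numpy as np

def encode_kitti_flow(flow):
    """Map a float flow array (H, W, 2) to a uint16 array (H, W, 3)."""
    # Scale by 64 and offset by 2**15, the inverse of the quoted decoding rule.
    scaled = np.rint(64.0 * flow + 2**15)
    # Assumption: clip so out-of-range flows do not wrap around in uint16.
    scaled = np.clip(scaled, 0, 2**16 - 1).astype(np.uint16)
    # Third channel of ones marks every pixel as valid.
    valid = np.ones(flow.shape[:2] + (1,), dtype=np.uint16)
    return np.concatenate((scaled, valid), axis=2)

def decode_kitti_flow(img):
    """Map a uint16 array (H, W, 3) back to (float32 flow, boolean valid mask)."""
    flow = (img[..., :2].astype(np.float32) - 2**15) / 64.0
    valid = img[..., 2] > 0
    return flow, valid

# Stand-in data; in the question this would be Scaled_Flow[-1].
rng = np.random.default_rng(0)
flow = rng.uniform(-100, 100, size=(4, 5, 2)).astype(np.float32)

img = encode_kitti_flow(flow)
decoded, valid = decode_kitti_flow(img)
max_err = np.abs(decoded - flow).max()
print(img.dtype, img.shape, max_err)
```

Because the encoding quantizes the flow in steps of 1/64 pixel, the decoded values should agree with the originals to within about 1/128 pixel when rounding is used.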
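Here is one way you can do that. To create the PNG file, I'll use numpngw, a library that I wrote for creating PNG and animated PNG files from numpy arrays. If you give numpngw.write_png a numpy array with data type np.uint16, it will create a PNG file with 16 bits per channel (i.e. a 48-bit image in this case).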
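import numpy as np
from numpngw import write_png

# Load the float32 flow array; shape (2, 375, 1242, 2).
Scaled_Flow = np.load('Scaled_Flow.npy')
# Scale the last flow in the batch by 64, offset by 2**15, and cast to uint16.
sf16 = (64*Scaled_Flow[-1] + 2**15).astype(np.uint16)
# Append a third channel of ones marking every pixel as valid.
imgdata = np.concatenate((sf16, np.ones(sf16.shape[:2] + (1,), dtype=sf16.dtype)), axis=2)
# A 3-channel uint16 array is written as a 48-bit PNG.
write_png('sf48.png', imgdata)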
Here is the image created by that script.
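To confirm that a file really is 48-bit rather than 24-bit, one option is to read the bit depth straight from the PNG header. This is a minimal sketch using only the standard library; `read_png_header` is a hypothetical helper, and the example header bytes are built by hand to stand in for the first 26 bytes of a file such as sf48.png.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_header(data):
    """Return (width, height, bit_depth, color_type) from the start of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk must be IHDR with a 13-byte payload.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing IHDR chunk")
    # Width and height are 4-byte big-endian; bit depth and color type are 1 byte each.
    return struct.unpack(">IIBB", data[16:26])

# Hand-built header for a 1242x375 image, 16 bits per channel, color type 2 (RGB).
example = (PNG_SIGNATURE
           + struct.pack(">I4s", 13, b"IHDR")
           + struct.pack(">IIBBBBB", 1242, 375, 16, 2, 0, 0, 0))
header = read_png_header(example)
print(header)

# For a real file (path is an assumption):
# with open('sf48.png', 'rb') as f:
#     print(read_png_header(f.read(26)))
```

A bit depth of 16 with color type 2 (truecolor) means three 16-bit channels, i.e. 48 bits per pixel; a bit depth of 8 would indicate the 24-bit output described in the question.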