Saving RGBD as single image
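I used this code https://www.programmersought.com/article/8773686326/ to create an RGBD image by combining an RGB image and a depth image.
Now I would like to know whether that RGBD data can be saved as a single image file (jpeg, png...).
I tried imageio.imwrite(), plt.imsave(), cv2.imwrite()... without success, probably because of the dimensions [4,64,1216]. Is there a way to do it?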
import numpy as np
import torch
from PIL import Image
from torchvision import transforms
scale = (64, 1216)
resize_img = transforms.Resize(scale, Image.BILINEAR)
resize_depth = transforms.Resize(scale, Image.NEAREST)
to_tensor = transforms.ToTensor()
img_id = 0
# load image and resize
img = Image.open('RGB_image.jpg')
img = resize_img(img)
img = np.array(img)
# load depth and resize
depth = Image.open('depth_image.png')
depth = resize_depth(depth)
depth = np.array(depth)
depth = depth[:, :, np.newaxis]
# tensor shape and value, normalization
img = Image.fromarray(img).convert('RGB')
img = to_tensor(img).float()
depth = depth / 65535
depth = to_tensor(depth).float()
rgbd = torch.cat((img, depth), 0)
print("\n\nRGBD shape")
print(rgbd.shape)
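We can save the depth as the alpha channel of an image in RGBA pixel format.
The alpha channel normally holds transparency, but we can use it as a 4th channel so that RGB and depth are stored together.
Since the depth may require high precision, possibly float32 precision, I suggest using the OpenEXR image format.
For compatibility with the OpenEXR format, we can convert all the channels to float32 in the range [0, 1].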
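Image writers generally expect a height x width x channels array, while the tensor in the question is channels-first ([4, 64, 1216]), which is likely part of why the save attempts failed. A minimal sketch of the rearrangement follows. It uses a random NumPy array as a stand-in for the tensor, and the variable names are illustrative only:

```python
import numpy as np

# Stand-in for the question's tensor: 4 channels (R, G, B, depth), 64 rows, 1216 columns.
rng = np.random.default_rng(0)
rgbd_chw = rng.random((4, 64, 1216), dtype=np.float32)

# Move the channel axis to the end: (4, H, W) -> (H, W, 4).
rgbd_hwc = np.transpose(rgbd_chw, (1, 2, 0))

# np.transpose returns a view; make it contiguous before handing it to an image writer.
rgbd_hwc = np.ascontiguousarray(rgbd_hwc)

print(rgbd_hwc.shape)
```

With a real PyTorch tensor, rgbd.numpy() should give an array in the same channels-first layout to start from.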
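Note:
- I am aware that Open3D supports RGBD images, but it does not seem to support reading and writing RGB and depth from/to a single file.
The following code samples use OpenCV instead of Pillow.
I thought OpenCV supported the EXR file format, but my OpenCV Python build does not support EXR, so I used the ImageIO package instead.
Stages for converting RGB and depth and writing them to an EXR file:
Load the RGB image, resize it and convert it to float: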
img = cv2.imread('RGB_image.jpg') # The channels order is BGR due to OpenCV conventions.
img = cv2.resize(img, scale, interpolation=cv2.INTER_LINEAR)
img = img.astype(np.float32) / 255 # Convert to float in range [0, 1]
Load the depth image, resize it and convert it to float:
depth = cv2.imread('depth_image.png', cv2.IMREAD_UNCHANGED) # Assume depth_image.png is 16 bits grayscale.
depth = cv2.resize(depth, scale, interpolation=cv2.INTER_NEAREST)
depth = depth.astype(np.float32) / 65535 # Convert to float in range [0, 1]
Merge img (3 channels) and depth (1 channel) into 4 channels.
The shape will be (1216, 64, 4) (applying the OpenCV BGRA color convention):
bgrd = np.dstack((img, depth))
Write bgrd to an EXR file:
If OpenCV was built with OpenEXR, we can use: cv2.imwrite('rgbd.exr', bgrd).
If we use ImageIO, it is better to convert from BGRA to RGBA before saving:
rgbd = cv2.cvtColor(bgrd, cv2.COLOR_BGRA2RGBA)
imageio.imwrite('rgbd.exr', rgbd)
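On recent OpenCV Python builds, the "OpenEXR codec is disabled" error can often be avoided by setting the OPENCV_IO_ENABLE_OPENEXR environment variable. The sketch below assumes such a build; whether it helps depends on the installed OpenCV version, so the ImageIO route remains the fallback:

```python
import os

# Assumption: the installed OpenCV honors this variable (recent opencv-python builds do).
# Set it before importing cv2 to be safe.
os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1"

try:
    import cv2
    # Informational only: collect the build-summary lines that mention OpenEXR, if any.
    exr_lines = [line.strip() for line in cv2.getBuildInformation().splitlines()
                 if "OpenEXR" in line]
except ImportError:
    cv2 = None
    exr_lines = []

print(exr_lines)
```

If the codec is available after this, cv2.imwrite('rgbd.exr', bgrd) can be tried directly with the float32 BGRA array.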
Code sample (converting RGB and depth to an RGBA EXR file, then reading it and converting back):
import numpy as np
import cv2
import imageio
scale = (64, 1216) # cv2.resize takes dsize as (width, height), so the result has 1216 rows and 64 columns.
# load image and resize
img = cv2.imread('RGB_image.jpg') # The channels order is BGR due to OpenCV conventions.
img = cv2.resize(img, scale, interpolation=cv2.INTER_LINEAR)
img = img.astype(np.float32) / 255 # Convert to float in range [0, 1]
# load depth and resize
depth = cv2.imread('depth_image.png', cv2.IMREAD_UNCHANGED) # Assume depth_image.png is 16 bits grayscale.
depth = cv2.resize(depth, scale, interpolation=cv2.INTER_NEAREST)
if depth.ndim == 3:
    depth = depth[:, :, 0] # Keep one channel if depth has 3 channels (np.dstack adds the channel axis, so depth[:, :, np.newaxis] is not needed).
depth = depth.astype(np.float32) / 65535 # Convert to float in range [0, 1]
# Use the depth channel as alpha channel (the channel order is BGRA - applies OpenCV conventions).
bgrd = np.dstack((img, depth))
print("\n\nRGBD shape")
print(bgrd.shape)
# Save the data to exr file (the color format of the exr file is RGBA).
# Error: cv::initOpenEXR imgcodecs: OpenEXR codec is disabled.
#cv2.imwrite('rgbd.exr', bgrd)
#
rgbd = cv2.cvtColor(bgrd, cv2.COLOR_BGRA2RGBA)
imageio.imwrite('rgbd.exr', rgbd)
################################################################################
# Reading the data:
#bgrd = cv2.imread('rgbd.exr') # Error: cv::initOpenEXR imgcodecs: OpenEXR codec is disabled.
rgbd = imageio.imread('rgbd.exr')
img = rgbd[:, :, 0:3] # First 3 channels are the image (RGB order, as saved).
depth = rgbd[:, :, 3] # Last channel is the depth
img = np.rint(img*255).astype(np.uint8) # Convert back to uint8 (round first, astype alone truncates).
#depth = np.rint(depth*65535).astype(np.uint16) # Convert back to uint16 (if required).
# Show images for testing:
cv2.imshow('img', cv2.cvtColor(img, cv2.COLOR_RGB2BGR)) # cv2.imshow expects BGR channel order.
cv2.imshow('depth', depth)
cv2.waitKey()
cv2.destroyAllWindows()
Note:
- You may need to make some modifications. I am not sure about the dimensions (64x1216 or 1216x64), nor about the code depth = depth[:, :, np.newaxis].
I may be wrong about the format of depth_image.png.
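When converting the float32 values back to integers, rounding before the cast matters, because astype alone truncates toward zero and can end up one below the original value. A minimal sketch that checks the round trip for every possible 16-bit depth value, assuming the same division by 65535 as above:

```python
import numpy as np

# Every possible 16-bit depth value.
depth_u16 = np.arange(65536, dtype=np.uint16)

# Normalize to float32 in [0, 1], as done before writing the EXR file.
depth_f32 = depth_u16.astype(np.float32) / 65535

# Round to the nearest integer before casting; astype alone truncates toward zero.
restored = np.rint(depth_f32 * 65535).astype(np.uint16)

print(np.array_equal(restored, depth_u16))
```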
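Update:
Saving 16-bit RGBA to a PNG file:
Instead of using an EXR file and float32 pixel format, we may use a PNG file and uint16 pixel format.
The pixel format of the PNG file is going to be RGBA (RGB and alpha, the transparency channel).
Each color channel is going to be 16 bits (2 bytes).
The alpha channel stores the depth map (in uint16 format).
Convert img to uint16 (we may choose not to scale by 256):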
img = img.astype(np.uint16)*256
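Multiplying by 256 maps 255 to 65280, slightly below the 16-bit maximum. Multiplying by 257 instead (an alternative that is not part of the answer) maps 255 to exactly 65535. A small sketch comparing the two, showing that integer division by 256 recovers the 8-bit values in both cases:

```python
import numpy as np

img_u8 = np.arange(256, dtype=np.uint8)  # every possible 8-bit value

scaled_256 = img_u8.astype(np.uint16) * 256  # 255 -> 65280
scaled_257 = img_u8.astype(np.uint16) * 257  # 255 -> 65535 (full 16-bit range)

# Integer division by 256 recovers the original 8-bit values in both cases.
back_256 = (scaled_256 // 256).astype(np.uint8)
back_257 = (scaled_257 // 256).astype(np.uint8)

print(scaled_256.max(), scaled_257.max())
```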
Merge img (3 channels) and depth (1 channel) into 4 channels:
bgrd = np.dstack((img, depth))
Save the merged image to a PNG file:
cv2.imwrite('rgbd.png', bgrd)
Code sample (the second part reads and displays the images for testing):
import numpy as np
import cv2
scale = (64, 1216) # cv2.resize takes dsize as (width, height).
# load image and resize
img = cv2.imread('RGB_image.jpg') # The channels order is BGR due to OpenCV conventions.
img = cv2.resize(img, scale, interpolation=cv2.INTER_LINEAR)
# Convert the image from 8 bits per color channel to 16 bits per color channel
# Notes:
# 1. We may choose not to scale by 256; the scaling is only needed for viewers that expect the [0, 65535] range.
# 2. Most image viewers interpret the alpha channel as transparency, so the image is going to look strange.
img = img.astype(np.uint16)*256
# load depth and resize
depth = cv2.imread('depth_image.png', cv2.IMREAD_UNCHANGED) # Assume depth_image.png is 16 bits grayscale.
depth = cv2.resize(depth, scale, interpolation=cv2.INTER_NEAREST)
if depth.ndim == 3:
    depth = depth[:, :, 0] # Keep one channel if depth has 3 channels (np.dstack adds the channel axis, so depth[:, :, np.newaxis] is not needed).
if depth.dtype != np.uint16:
    depth = depth.astype(np.uint16) # The depth is supposed to be uint16, so the code should not reach here.
# Use the depth channel as alpha channel (the channel order is BGRA - applies OpenCV conventions).
bgrd = np.dstack((img, depth))
print("\n\nRGBD shape")
print(bgrd.shape) # (1216, 64, 4)
# Save the data to PNG file (the pixel format of the PNG file is 16 bits RGBA).
cv2.imwrite('rgbd.png', bgrd)
# Testing:
################################################################################
# Reading the data:
bgrd = cv2.imread('rgbd.png', cv2.IMREAD_UNCHANGED)
img = bgrd[:, :, 0:3] # First 3 channels are the image.
depth = bgrd[:, :, 3] # Last channel is the depth
#img = (img // 256).astype(np.uint8) # Convert back to uint8
# Show images for testing:
cv2.imshow('img', img)
cv2.imshow('depth', depth)
cv2.waitKey()
cv2.destroyAllWindows()
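To get back to the channels-first float layout used in the question, the array read from the PNG can be reordered and rescaled. The sketch below is an illustration only: bgrd_to_rgbd_chw is a hypothetical helper name, and it assumes the color channels were scaled by 256 when saving, as above:

```python
import numpy as np

def bgrd_to_rgbd_chw(bgrd):
    """Hypothetical helper: (H, W, 4) uint16 BGR+depth -> (4, H, W) float32 RGB+depth in [0, 1]."""
    bgrd = np.asarray(bgrd)
    # Reorder BGR -> RGB, undo the *256 scaling and map to [0, 1] in one division.
    rgb = bgrd[:, :, [2, 1, 0]].astype(np.float32) / (255 * 256)
    # Depth was stored unscaled as uint16.
    depth = bgrd[:, :, 3:4].astype(np.float32) / 65535
    rgbd = np.concatenate((rgb, depth), axis=2)
    # Move channels first: (H, W, 4) -> (4, H, W).
    return np.ascontiguousarray(np.transpose(rgbd, (2, 0, 1)))

# Synthetic stand-in for the array returned by cv2.imread('rgbd.png', cv2.IMREAD_UNCHANGED).
bgrd = np.zeros((1216, 64, 4), dtype=np.uint16)
bgrd[..., 0] = 255 * 256  # blue at full intensity
bgrd[..., 3] = 65535      # maximum depth

rgbd = bgrd_to_rgbd_chw(bgrd)
print(rgbd.shape, rgbd.dtype)
```

The result can then be passed to torch.from_numpy(rgbd) if a tensor is needed.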