Reading numpy array from binary file as float16 instead of float32 reshapes the input

Question

I am trying to read .pfm images of shape 804 x 600, and I wrote the function below to do so. I know my images are float16, but they are being read as float32.

import numpy as np

def read_pfm(file):
    """Decode a .pfm file and return its data as a numpy array."""
    with open(file, "rb") as f:
        # read the three header lines: type, "width height", scale
        line1, line2, line3 = (f.readline() for _ in range(3))
        width, height = (int(s) for s in line2.split())

        # read the remaining bytes as big-endian float
        data = np.fromfile(f, '>f')  # TODO: data is read as float32. Why? Should be float16
    print(data.dtype)
    print(data.shape)
    # the original referenced an undefined `shape`; a single-channel image is assumed here
    data = np.reshape(data, (height, width))
    return data

I have two questions:

Why are my images read as float32 by default when they are float16?
When I force the images to be read as float16 like this

data = np.fromfile(f,'>f2')

the shape of the input changes from (482400,) to (964800,). Why does this happen?

Edit: I realized I was mistaken; the images are actually float32. However, Daweo's answer still cleared up my confusion about 16-bit versus 32-bit.

Answer 1

When I do force the images to be read as float16 (...) the shape of input changes from (482400,) to (964800,). Why does this happen?

Observe that 482400 * 2 == 964800 and 32/16 == 2. Consider a simple example: suppose you have the following 8 bits

01101110

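When you are told to treat them as 8-bit integers, you see a single number (01101110); when told to treat them as 4-bit integers, you see 2 numbers (0110, 1110); and when told to treat them as 2-bit integers, you see 4 numbers (01, 10, 11, 10). In the same way, if a given byte sequence holds N numbers when interpreted as float32, the same byte sequence holds N*(32/16), that is N*2, numbers when interpreted as float16.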
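The same reasoning can be checked directly in NumPy. The snippet below is only a minimal sketch: `split_bits` is a hypothetical helper written for this illustration, and the zero-filled buffer is a stand-in for the pixel data of an 804 x 600 single-channel image, not real file contents. It also shows why `'>f'` gives float32: `f` is NumPy's type code for single precision (4 bytes), while `'>f2'` asks for 2-byte floats.

```python
import numpy as np

def split_bits(bits, width):
    """Hypothetical helper: cut a bit string into consecutive fields of `width` bits."""
    return [bits[i:i + width] for i in range(0, len(bits), width)]

bits = "01101110"
print(split_bits(bits, 8))  # one 8-bit field
print(split_bits(bits, 4))  # two 4-bit fields
print(split_bits(bits, 2))  # four 2-bit fields

# 'f' means single precision, so '>f' is big-endian float32 (4 bytes per element);
# '>f2' is big-endian float16 (2 bytes per element).
print(np.dtype(">f").itemsize)   # 4
print(np.dtype(">f2").itemsize)  # 2

# Stand-in for the raw pixel bytes of an 804 x 600 single-channel float32 image.
raw = np.zeros(804 * 600, dtype=">f4").tobytes()

# The same bytes, interpreted with two different element sizes.
as_f32 = np.frombuffer(raw, dtype=">f")
as_f16 = np.frombuffer(raw, dtype=">f2")
print(as_f32.shape)  # (482400,)
print(as_f16.shape)  # (964800,)
```

The byte count never changes; only the element size used to slice it does, so halving the element size doubles the element count. With real float32 data the float16 values would also be meaningless, since each one is built from half of a float32 bit pattern.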

python

floating-point

numpy

binaryfiles