How to create a custom NaN (single precision) in Python without setting the 23rd bit?

Question

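I am trying to create floating-point NaNs by choosing the fraction bits myself. But it seems that a Python float always sets the 23rd fraction bit (IEEE 754 single precision) when it interprets a NaN.

So my question is: is it possible to define a float NaN in Python without the 23rd bit being set?

(I am using Python 2.7)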

NaNs in IEEE 754 have this format:
sign = either 0 or 1.
biased exponent = all 1 bits.
fraction = anything except all 0 bits (since all 0 bits represents infinity).
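
As a rough illustration of the layout above, the following sketch splits a 32-bit pattern into its three fields using plain integer arithmetic, so no float conversion is involved. The helper name describe_float32_bits is made up for this example, and the code is written for Python 3. The "23rd bit" in the question is the most significant fraction bit (mask 0x400000), which marks a quiet NaN under the IEEE 754-2008 convention.

```python
def describe_float32_bits(bits):
    """Split a 32-bit IEEE 754 pattern into fields (hypothetical helper)."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF     # 8 biased exponent bits
    fraction = bits & 0x7FFFFF         # 23 fraction bits
    is_nan = exponent == 0xFF and fraction != 0
    # Most significant fraction bit: set for quiet NaNs (IEEE 754-2008 convention).
    is_quiet = is_nan and (fraction & 0x400000) != 0
    return {"sign": sign, "exponent": exponent, "fraction": fraction,
            "is_nan": is_nan, "is_quiet": is_quiet}

# Infinity, a signaling-style NaN, and its quieted counterpart.
for pattern in (0x7F800000, 0x7F800001, 0x7FC00001):
    print(hex(pattern), describe_float32_bits(pattern))
```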

So the hex representation of a NaN could be 0x7F800001, but when I interpret this int as a float and then interpret it back as an int, I get 0x7FC00001.

1st try: struct.pack/unpack:

import struct

def hex_to_float(value):
    return struct.unpack( '@f', struct.pack( '@L', value) )[0]

def float_to_hex(value):
    return struct.unpack( '@L', struct.pack( '@f', value) )[0]

print hex(float_to_hex(hex_to_float(0x7f800001)))
# 0x7fc00001

2nd try: ctypes

import ctypes

def float2hex(float_input):
    INTP = ctypes.POINTER(ctypes.c_uint)
    float_value = ctypes.c_float(float_input)
    my_pointer = ctypes.cast(ctypes.addressof(float_value), INTP)
    return my_pointer.contents.value

def hex2float(hex_input):
    FLOATP = ctypes.POINTER(ctypes.c_float)
    int_value = ctypes.c_uint(hex_input)
    my_pointer = ctypes.cast(ctypes.addressof(int_value), FLOATP)
    return my_pointer.contents.value

print hex(float2hex(hex2float(0x7f800001)))
# 0x7fc00001L

3rd try: the xdrlib packers. Same result.

Answer 1

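What are you actually trying to do?

Any Python code that works with floats will at best ignore a "specially crafted" NaN, and at worst crash.

If you are passing this value to something outside Python code (serializing it, or calling a C API), just use struct to define it with the exact bytes you want, and send those bytes to the desired destination.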
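
A minimal sketch of that idea, assuming Python 3 and a little-endian wire format: build the four bytes from the integer pattern and never turn them into a Python float. The function names are illustrative only.

```python
import struct

def nan_pattern_to_bytes(bits):
    # '<I' = little-endian unsigned 32-bit integer; no float is created here.
    return struct.pack("<I", bits)

def bytes_to_pattern(raw):
    # Read the four bytes back as an integer, again without going through a float.
    return struct.unpack("<I", raw)[0]

payload = nan_pattern_to_bytes(0x7F800001)
print(payload)                          # raw bytes to write to a file or socket
print(hex(bytes_to_pattern(payload)))   # the pattern survives the round trip
```

Because only integers and bytes are involved, the round trip cannot quiet the NaN; what the receiving side does with the bytes is a separate matter.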

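Also, if you are using NumPy, then yes, you can create special NaNs and expect them to be preserved inside an ndarray. But the way to do that is again to specify the exact bytes you want with struct, and then convert the data type in some way that preserves the buffer contents.

Check this answer about building 80-bit doubles for use with NumPy for a workaround:

I tried numpy.frombuffer here, and it interprets the byte sequence you craft as a 32-bit float, if that works for you: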

import numpy as np
import binascii
a = "7f800001"
b = binascii.unhexlify(a)  # in Python 2, a.decode("hex") would also work, but not in Python 3
# on a little-endian machine we need to reverse the byte order
c = b[::-1]  # slicing works for both Python 2 str and Python 3 bytes
x = np.frombuffer(c, dtype="float32")
x.tobytes()

This prints the original bytes:

'\x01\x00\x80\x7f'

And inspecting the array x shows that it is indeed a NaN:

>>> x
array([nan], dtype=float32)

However, for the reasons above, if you extract the value from the NumPy array with x[0], it will be converted to a "pasteurized" float64 NaN with the default value.

Answer 2

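The underlying problem is that you convert a C float (32 bits) to a Python float (64 bits, i.e. a double in C parlance) and then back to a C float.

Performing the two conversions does not always give back the original input, and you are witnessing such a case.

If the exact bit pattern matters, you should avoid the above conversions at all costs.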
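
One way to follow this advice when a C float has to be handed to native code is to fill the memory of a ctypes.c_float directly from bytes and never read its .value attribute. This is only a sketch, assuming Python 3; make_c_float_from_bits and c_float_bits are hypothetical helper names.

```python
import ctypes
import struct

def make_c_float_from_bits(bits):
    # '=I' packs in native byte order with standard size (4 bytes),
    # so the bytes match the in-memory layout of a C float on this machine.
    raw = struct.pack("=I", bits)
    # Copy the bytes straight into the c_float's storage; no double is involved.
    return ctypes.c_float.from_buffer_copy(raw)

def c_float_bits(c_value):
    # Read the storage back as raw bytes instead of using c_value.value,
    # which would convert to a Python float (a C double).
    return struct.unpack("=I", bytes(c_value))[0]

snan = make_c_float_from_bits(0x7F800001)
print(hex(c_float_bits(snan)))
```

The object can then be passed to a foreign function, for example with ctypes.byref(snan). Whether the callee preserves the payload depends on what it does with the value.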

Here are some gory details:

When struct.unpack('=f', some_bytes) is called, the following happens. (Note that I use the standard-size format character '=' rather than the native size '@' that you used; for example, '@L' means different things on Windows and Linux.)

unpack_float is called, which calls
_PyFloat_Unpack4, which interprets the data as a
32-bit C float, i.e. float,
but converts it to double while returning (because the function's return type is double).

On x86-64, this last conversion is performed by the instruction VCVTSS2SD (Convert Scalar Single-Precision Floating-Point Value to Scalar Double-Precision Floating-Point Value), and this operation turns

0x7f800001 into 0x7ff8000020000000.

As you can see, the result of struct.unpack('=f', struct.pack('=L', value))[0] already differs from the input.

Conversely, calling struct.pack('=f', value) for a Python float value (which is a wrapper around C's double) gets us to _PyFloat_Pack4, where the conversion from double to float happens, i.e. CVTSD2SS (Convert Scalar Double-Precision Floating-Point Value to Scalar Single-Precision Floating-Point Value) is called, and

0x7ff8000020000000 becomes 0x7fc00001.
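
The two steps can be modelled with integer arithmetic. The sketch below is not what CPython executes; it only mimics what the two x86-64 conversion instructions do to a NaN as described above: the payload is kept, shifted by the 29-bit difference in fraction width, and the most significant fraction bit is set. The function names are made up for the illustration, and both functions assume the input really is a NaN.

```python
def widen_nan_bits(bits32):
    """Model of float -> double for a NaN pattern (illustrative only)."""
    sign = bits32 >> 31
    fraction = bits32 & 0x7FFFFF
    # All-ones 11-bit exponent, quiet bit (bit 51) set, payload shifted left by 29.
    return (sign << 63) | (0x7FF << 52) | (1 << 51) | (fraction << 29)

def narrow_nan_bits(bits64):
    """Model of double -> float for a NaN pattern (illustrative only)."""
    sign = bits64 >> 63
    # Keep the upper 23 of the 52 fraction bits; the lower 29 are dropped.
    fraction = (bits64 >> 29) & 0x7FFFFF
    # All-ones 8-bit exponent, quiet bit (bit 22) set.
    return (sign << 31) | (0xFF << 23) | (1 << 22) | fraction

as_double = widen_nan_bits(0x7F800001)
as_float = narrow_nan_bits(as_double)
print(hex(as_double))  # 0x7ff8000020000000
print(hex(as_float))   # 0x7fc00001
```

The model makes it visible that the low payload bit survives both steps, while the quiet bit, once set by the first conversion, never goes away.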

