How to Avoid Precision Errors While Creating Numpy Arrays?

Question

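My code has to reach a certain precision. However, I think errors are being summed up over the long array, so it accumulates a larger error than expected. Can you work out why element 38260 is not equal to 5.6?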

import numpy as np

xmin = -377
xmax = 345
spacing = 0.01

a = np.arange(xmin,xmax,spacing)
print('{0:.25f}'.format(a[38260]))
# Prints 5.5999999996520273271016777
print(a[38260] == 5.6)
# Prints False

b = np.arange(xmin/spacing,xmax/spacing)*spacing
print('{0:.25f}'.format(b[38260]))
# Prints 5.6000000000000005329070518
print(b[38260] == 5.6)
# Prints False

Answer 1

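You can avoid the error accumulation by first creating an integer array (np.int32 or np.int64, depending on the platform) and then dividing it down to your floating-point range: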

import numpy as np

xmin = -37700  # times 100
xmax = 34500   # times 100
spacing = 1    # times 100

b = np.arange(xmin, xmax, spacing)  # np.int32 array - no loss of precision

a = b / 100.0  # divide by 100      # np.float64 array - closer to what you want

print('{0:.25f}'.format(a[38260]))
print(a[38260] == 5.6)

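Output:

 5.5999999999999996447286321
 True

(The long digit string is the float64 value nearest to 5.6, which is why the comparison returns True.)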
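To make the accumulation visible, the two constructions can be compared element by element. This is only a rough sketch: the names direct, scaled and drift are illustrative, the scale factor of 100 is hard-coded for a spacing of 0.01, and the exact size of the difference may vary with the NumPy version.

```python
import numpy as np

xmin, xmax, spacing = -377, 345, 0.01

# Float step: the rounding error of the step is multiplied by the index
direct = np.arange(xmin, xmax, spacing)
# Integer steps are exact; each element is rounded only once by the division
scaled = np.arange(xmin * 100, xmax * 100) / 100.0

# Compare only the common prefix in case the two lengths differ
n = min(len(direct), len(scaled))
drift = np.abs(direct[:n] - scaled[:n])
max_drift = drift.max()
worst = int(drift.argmax())

print(f"largest difference: {max_drift:.3e} at index {worst}")
print(f"index 38260: direct={direct[38260]!r}, scaled={scaled[38260]!r}")
```

On a typical setup the difference should be tiny near the start of the array and grow toward the end, which is the behaviour the question describes.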

It is still advisable to use np.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False) rather than == for equality comparisons.
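As a small illustration of that advice, here is how a tolerance-based comparison might look on the array from the question. The default tolerances are used; whether they suit a given application is an assumption to check, since they must stay well below the grid spacing.

```python
import numpy as np

a = np.arange(-377, 345, 0.01)
value = a[38260]

# Exact comparison depends on the last bits of the float
print(value == 5.6)
# Tolerance-based comparison: |value - 5.6| <= atol + rtol * |5.6|
print(np.isclose(value, 5.6))

# The same check applied to the whole array finds the matching index
matches = np.flatnonzero(np.isclose(a, 5.6))
print(matches)
```

Because the default tolerance here (about 5.6e-5) is much smaller than the spacing of 0.01, only one element of the array qualifies as a match.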

Your original code already uses np.float64 on my system, so I see no way to make it more precise, and floating-point numbers are notoriously fiddly to work with; see:

Is floating point math broken?
What is the best way to compare floats for almost-equality in Python?

Answer 2

To create the array, you can build it like a list from a custom FloatRange class, with the help of the decimal module:

from decimal import Decimal
import numpy as np

class FloatRange:
    def __init__(self, start, stop, step, include_right=False):
        # Go through strings so that 0.01 becomes exactly Decimal('0.01')
        self.start = Decimal(f'{start}')
        self.stop = Decimal(f'{stop}')
        self.step = Decimal(f'{step}')
        # Compute the span in decimal arithmetic, not in floats
        self.range = self.stop - self.start
        self.include_right = include_right
        self.len = int(self.range / self.step) + include_right
        self.max_index = self.len - 1
    def __getitem__(self, index):
        if index < 0:
            index = self.len + index
        if index < 0 or index > self.max_index:
            raise IndexError('FloatRange index out of range.')
        # Exact decimal result, rounded to a float only once
        return float(self.start + index * self.step)
    def __len__(self):
        return self.len
    def __iter__(self):
        # A fresh generator per call, so the range can be iterated repeatedly
        for index in range(self.len):
            yield self[index]
    def to_numpy(self):
        # np.fromiter requires an explicit dtype
        return np.fromiter(self, dtype=np.float64, count=self.len)
    def __repr__(self):
        return (f"FloatRange("
                    f"start={self.start}, "
                    f"stop={self.stop}, "
                    f"step={self.step}, "
                    f"include_right={self.include_right})")

Then you can create a FloatRange object:

>>> fr = FloatRange(xmin, xmax, spacing)
>>> fr
FloatRange(start=-377, stop=345, step=0.01, include_right=False)

>>> fr[-1]
344.99

>>> fr[38260]
5.6

>>> arr = fr.to_numpy() # Equivalent to: arr = np.array(list(fr))

>>> arr[38260]
5.6

If include_right == True:

>>> fr = FloatRange(xmin, xmax, spacing, include_right=True)
>>> fr[-1]
345.0
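The reason the class converts its arguments through strings can be seen in a short sketch. The variable names below are illustrative only; the point is that Decimal built from a string keeps the written value, while Decimal built from a float keeps the float's binary approximation.

```python
from decimal import Decimal

from_string = Decimal('0.01')   # exactly one hundredth
from_float = Decimal(0.01)      # exact value of the nearest binary float
print(from_string)
print(from_float)

# Element 38260 computed in decimal arithmetic, then rounded once to float
start, step = Decimal('-377'), Decimal('0.01')
value = start + 38260 * step
print(value)                 # 5.60
print(float(value) == 5.6)   # True
```

The trade-off is speed: decimal arithmetic in pure Python is much slower than a vectorised NumPy operation, so this approach fits best when the array is built once and reused.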
