如何在 Python 文档字符串测试中处理舍入为负零
How to handle rounding to negative zero in Python docstring tests
与此问题相关:How to have negative zero always formatted as positive zero in a python string?
我有以下使用 Numpy 实现 Matlab orth.m 的函数。我有一个文档字符串测试,它依赖于 np.array2string 使用 suppress_small=True,这将使小值四舍五入为零。但是,有时它们四舍五入为正零,有时四舍五入为负零,这取决于答案是 1e-16 还是 -1e-17 或类似结果。发生哪种情况基于 SVD 分解,并且可能因平台而异或跨 Python 版本,具体取决于使用的底层线性代数求解器(BLAS、Lapack 等)
设计文档字符串测试以解决此问题的最佳方法是什么?
在最后的doctest中,有时Q[0, 1]项是-0。有时它是 0.
import doctest
import numpy as np
def orth(A):
r"""
Orthogonalization basis for the range of A.
That is, Q.T @ Q = I, the columns of Q span the same space as the columns of A, and the number
of columns of Q is the rank of A.
Parameters
----------
A : 2D ndarray
Input matrix
Returns
-------
Q : 2D ndarray
Orthogonalization matrix of A
Notes
-----
#. Based on the Matlab orth.m function.
Examples
--------
>>> import numpy as np
Full rank matrix
>>> A = np.array([[1, 0, 1], [-1, -2, 0], [0, 1, -1]])
>>> r = np.linalg.matrix_rank(A)
>>> print(r)
3
>>> Q = orth(A)
>>> with np.printoptions(precision=8):
... print(Q)
[[-0.12000026 -0.80971228 0.57442663]
[ 0.90175265 0.15312282 0.40422217]
[-0.41526149 0.5664975 0.71178541]]
Rank deficient matrix
>>> A = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]])
>>> r = np.linalg.matrix_rank(A)
>>> print(r)
2
>>> Q = orth(A)
>>> print(np.array2string(Q, precision=8, suppress_small=True)) # Sometimes this fails
[[-0.70710678 -0. ]
[ 0. 1. ]
[-0.70710678 0. ]]
"""
# compute the SVD
(Q, S, _) = np.linalg.svd(A, full_matrices=False)
# calculate a tolerance based on the first eigenvalue (instead of just using a small number)
tol = np.max(A.shape) * S[0] * np.finfo(float).eps
# sum the number of eigenvalues that are greater than the calculated tolerance
r = np.sum(S > tol, axis=0)
# return the columns corresponding to the non-zero eigenvalues
Q = Q[:, np.arange(r)]
return Q
if __name__ == '__main__':
doctest.testmod(verbose=False)
您可以打印 rounded 数组加上 0.0
以消除 -0
:
A = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]])
Q = orth(A)
Q[0,1] = -1e-16 # simulate a small floating point deviation
print(np.array2string(Q.round(8)+0.0, precision=8, suppress_small=True))
#[[-0.70710678 0. ]
# [ 0. 1. ]
# [-0.70710678 0. ]]
所以你的文档字符串应该是:
>>> Q = orth(A)
>>> print(np.array2string(Q.round(8)+0.0, precision=8, suppress_small=True)) # guarantee non-negative zeros
[[-0.70710678 0. ]
[ 0. 1. ]
[-0.70710678 0. ]]
这是我想出的另一种选择,尽管我认为我更喜欢将数组四舍五入到给定的精度。在此方法中,您将整个数组移动某个大于舍入误差但小于精度比较的量。这样,小数字仍然总是略微为正。
>>> Q = orth(A)
>>> print(np.array2string(Q + np.full(Q.shape, 1e-14), precision=8, suppress_small=True))
[[-0.70710678 0. ]
[ 0. 1. ]
[-0.70710678 0. ]]
与此问题相关:How to have negative zero always formatted as positive zero in a python string?
我有以下使用 Numpy 实现 Matlab orth.m 的函数。我有一个文档字符串测试,它依赖于 np.array2string 使用 suppress_small=True,这将使小值四舍五入为零。但是,有时它们四舍五入为正零,有时四舍五入为负零,这取决于答案是 1e-16 还是 -1e-17 或类似结果。发生哪种情况基于 SVD 分解,并且可能因平台而异或跨 Python 版本,具体取决于使用的底层线性代数求解器(BLAS、Lapack 等)
设计文档字符串测试以解决此问题的最佳方法是什么?
在最后的doctest中,有时Q[0, 1]项是-0。有时它是 0.
import doctest
import numpy as np
def orth(A):
r"""
Orthogonalization basis for the range of A.
That is, Q.T @ Q = I, the columns of Q span the same space as the columns of A, and the number
of columns of Q is the rank of A.
Parameters
----------
A : 2D ndarray
Input matrix
Returns
-------
Q : 2D ndarray
Orthogonalization matrix of A
Notes
-----
#. Based on the Matlab orth.m function.
Examples
--------
>>> import numpy as np
Full rank matrix
>>> A = np.array([[1, 0, 1], [-1, -2, 0], [0, 1, -1]])
>>> r = np.linalg.matrix_rank(A)
>>> print(r)
3
>>> Q = orth(A)
>>> with np.printoptions(precision=8):
... print(Q)
[[-0.12000026 -0.80971228 0.57442663]
[ 0.90175265 0.15312282 0.40422217]
[-0.41526149 0.5664975 0.71178541]]
Rank deficient matrix
>>> A = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]])
>>> r = np.linalg.matrix_rank(A)
>>> print(r)
2
>>> Q = orth(A)
>>> print(np.array2string(Q, precision=8, suppress_small=True)) # Sometimes this fails
[[-0.70710678 -0. ]
[ 0. 1. ]
[-0.70710678 0. ]]
"""
# compute the SVD
(Q, S, _) = np.linalg.svd(A, full_matrices=False)
# calculate a tolerance based on the first eigenvalue (instead of just using a small number)
tol = np.max(A.shape) * S[0] * np.finfo(float).eps
# sum the number of eigenvalues that are greater than the calculated tolerance
r = np.sum(S > tol, axis=0)
# return the columns corresponding to the non-zero eigenvalues
Q = Q[:, np.arange(r)]
return Q
if __name__ == '__main__':
doctest.testmod(verbose=False)
您可以打印 rounded 数组加上 0.0
以消除 -0
:
A = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]])
Q = orth(A)
Q[0,1] = -1e-16 # simulate a small floating point deviation
print(np.array2string(Q.round(8)+0.0, precision=8, suppress_small=True))
#[[-0.70710678 0. ]
# [ 0. 1. ]
# [-0.70710678 0. ]]
所以你的文档字符串应该是:
>>> Q = orth(A)
>>> print(np.array2string(Q.round(8)+0.0, precision=8, suppress_small=True)) # guarantee non-negative zeros
[[-0.70710678 0. ]
[ 0. 1. ]
[-0.70710678 0. ]]
这是我想出的另一种选择,尽管我认为我更喜欢将数组四舍五入到给定的精度。在此方法中,您将整个数组移动某个大于舍入误差但小于精度比较的量。这样,小数字仍然总是略微为正。
>>> Q = orth(A)
>>> print(np.array2string(Q + np.full(Q.shape, 1e-14), precision=8, suppress_small=True))
[[-0.70710678 0. ]
[ 0. 1. ]
[-0.70710678 0. ]]