Why does numpy.sin return a different result if the argument size is greater than 8192?
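I discovered that numpy.sin behaves differently when the argument size is <= 8192 and when it is > 8192. The difference is in both performance and returned values. Can someone explain this effect?
For example, let's calculate sin(pi/4):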
import numpy as np

# Run in IPython/Jupyter: %timeit is an IPython magic, not plain Python.
x = np.pi*0.25
for n in range(8191, 8195):
    xx = np.repeat(x, n)
    %timeit np.sin(xx)
    print(n, np.sin(xx)[0])
64.7 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8191 0.7071067811865476
64.6 µs ± 166 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8192 0.7071067811865476
20.1 µs ± 189 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8193 0.7071067811865475
21.8 µs ± 13.4 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8194 0.7071067811865475
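After crossing the limit of 8192 elements, the calculation becomes more than 3 times faster and gives a different result: the last digit becomes 5 instead of 6.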
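To put the size of that difference in perspective, the two printed values can be compared at the bit level. The following is only an illustrative sketch; the helper name ulp_distance is made up for this example and is not part of NumPy. It counts how many representable doubles separate two floats of the same sign.

```python
import struct

def ulp_distance(a: float, b: float) -> int:
    """Count the representable doubles between two finite floats of the same sign."""
    # Reinterpret each IEEE 754 double as a signed 64-bit integer. For floats of
    # the same sign, the integer difference equals the distance in ULPs.
    ia = struct.unpack("<q", struct.pack("<d", a))[0]
    ib = struct.unpack("<q", struct.pack("<d", b))[0]
    return abs(ia - ib)

# The two values printed by np.sin above.
print(ulp_distance(0.7071067811865476, 0.7071067811865475))  # prints 1
```

In other words, the two results are neighbouring doubles: they differ by a single unit in the last place.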
When I tried to calculate the same value by other means, I got:
- C++ std::sin (Visual Studio 2017, Win32 platform) gives 0.7071067811865475;
- C++ std::sin (Visual Studio 2017, x64 platform) gives 0.70710678118654756;
- math.sin gives 0.7071067811865476, which is logical because I use 64-bit Python.
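The x64 std::sin value is printed with 17 significant digits, so it is not obvious at a glance which of the other results it matches. A quick check, parsing the three printed strings back into doubles (the variable names are only for illustration):

```python
# Parse the printed values back into IEEE 754 doubles.
win32_sin = float("0.7071067811865475")    # C++ std::sin, Win32
x64_sin = float("0.70710678118654756")     # C++ std::sin, x64 (17 significant digits)
python_sin = float("0.7071067811865476")   # math.sin, 64-bit Python

print(x64_sin == python_sin)    # True: same double, printed with different precision
print(win32_sin == python_sin)  # False: a different double
print(win32_sin.hex(), python_sin.hex())  # exact bit patterns of the two doubles
```

So there are only two distinct results in play: the x64 C++ value and the math.sin value are the same double, and the Win32 value is the other one.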
I could not find any explanation in the NumPy documentation or in its code.
Update #2: It is hard to believe, but replacing sin with sqrt gives this:
44.2 µs ± 751 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8191 0.8862269254527579
44.1 µs ± 543 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8192 0.8862269254527579
10.3 µs ± 105 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8193 0.886226925452758
10.4 µs ± 4.41 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8194 0.886226925452758
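For sqrt the comparison is easier to reason about than for sin, because IEEE 754 requires the square root to be correctly rounded, so at most one of the two printed values can be the correctly rounded one. The sketch below uses only the standard library (decimal and math, not NumPy) and measures how far each printed value is from a high-precision square root of the same input:

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # far more digits than a double carries

x = math.pi * 0.25          # same double as np.pi*0.25
exact = Decimal(x).sqrt()   # high-precision square root of the double x

# The two values printed above for np.sqrt.
candidates = [0.8862269254527579, 0.886226925452758]

# Decimal(float) converts exactly, so these are the true rounding errors.
errors = {c: abs(Decimal(c) - exact) for c in candidates}
closest = min(candidates, key=errors.get)

for c in candidates:
    print(repr(c), errors[c])
print("closest to the exact root:", repr(closest))
print("math.sqrt(x):             ", repr(math.sqrt(x)))
```

Whichever candidate is closer is the correctly rounded result, and it should agree with math.sqrt on a platform whose C library follows IEEE 754.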
Update: output of np.show_config():
mkl_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\Library\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\include', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\lib', 'C:/GNU/Anaconda3\Library\include']
blas_mkl_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\Library\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\include', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\lib', 'C:/GNU/Anaconda3\Library\include']
blas_opt_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\Library\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\include', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\lib', 'C:/GNU/Anaconda3\Library\include']
lapack_mkl_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\Library\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\include', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\lib', 'C:/GNU/Anaconda3\Library\include']
lapack_opt_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\Library\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\include', 'C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.0.117\windows\mkl\lib', 'C:/GNU/Anaconda3\Library\include']
As @WarrenWeckesser wrote, "it's almost certainly an Anaconda & Intel MKL issue; cf. https://github.com/numpy/numpy/issues/11448 and https://github.com/ContinuumIO/anaconda-issues/issues/9129".
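To check quickly whether a given installation is in the same situation, one option is to capture what np.show_config() prints and look for MKL in it. This is only a heuristic sketch: the function name numpy_config_mentions_mkl is hypothetical, and a plain text search can produce false positives (for example, a directory name containing "mkl").

```python
import contextlib
import io

import numpy as np

def numpy_config_mentions_mkl() -> bool:
    """Heuristic: does the text printed by np.show_config() mention MKL?"""
    buffer = io.StringIO()
    # np.show_config() prints its report, so capture standard output.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return "mkl" in buffer.getvalue().lower()

print("NumPy", np.__version__, "config mentions MKL:", numpy_config_mentions_mkl())
```

With the configuration shown above, the captured text contains 'mkl_rt', so the function would return True.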
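Unfortunately, the only way to solve the problem under Windows is to uninstall Anaconda and use another distribution whose numpy does not include MKL. I used python-3.6.6-amd64 from https://www.python.org/ and installed everything else via pip, including numpy 1.14.5. I even managed to get Spyder working (I had to downgrade PyQt5 to 5.11.3, because Spyder refused to start with PyQt5 >= 5.12).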
Now np.sin(xx) is always 0.7071067811865476 (67.1 µs at n = 8192) and np.sqrt(xx) is always 0.8862269254527579 (16.4 µs). A bit slower, but perfectly reproducible.
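The reproducibility claim can be checked outside IPython with a short script. This is a minimal sketch; the helper first_elements_by_size is a name chosen for this example. It evaluates a ufunc on arrays of each size around the 8192 boundary and reports whether the first element changes with the array length.

```python
import numpy as np

def first_elements_by_size(func, x, sizes):
    """Apply func to arrays filled with x and return {size: first result element}."""
    return {n: float(func(np.repeat(x, n))[0]) for n in sizes}

x = np.pi * 0.25
sizes = range(8191, 8195)

for name, func in (("sin", np.sin), ("sqrt", np.sqrt)):
    results = first_elements_by_size(func, x, sizes)
    distinct = set(results.values())
    # One distinct value means the result does not depend on the array length.
    status = "reproducible" if len(distinct) == 1 else "size-dependent"
    print(name, status, sorted(distinct))
```

On the Anaconda/MKL installation described in the question this would be expected to report "size-dependent" for both functions; on the pip-installed numpy it should report "reproducible".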