std::poisson_distribution 中的 C++ 标准库中存在错误?
Bug in the C++ standard library in std::poisson_distribution?
我想我遇到了来自 C++ 标准库的 std::poisson_distribution 的错误行为。
问题:
- 你能确认这确实是一个错误,而不是我的错误吗?
- poisson_distribution函数的标准库代码究竟有什么问题,假设它确实是一个错误?
详情:
以下 C++ 代码(文件 poisson_test.cc)用于生成泊松分布数:
#include <array>
#include <cmath>
#include <iostream>
#include <random>
int main() {
// The problem turned out to be independent on the engine
std::mt19937_64 engine;
// Set fixed seed for easy reproducibility
// The problem turned out to be independent on seed
engine.seed(1);
std::poisson_distribution<int> distribution(157.17);
for (int i = 0; i < 1E8; i++) {
const int number = distribution(engine);
std::cout << number << std::endl;
}
}
我编译这段代码如下:
clang++ -o poisson_test -std=c++11 poisson_test.cc
./poisson_test > mypoisson.txt
以下 python 脚本用于分析文件 mypoisson.txt 中的随机数序列:
import numpy as np
import matplotlib.pyplot as plt
def expectation(x, m):
" Poisson pdf "
# Use Ramanujan formula to get ln n!
lnx = x * np.log(x) - x + 1./6. * np.log(x * (1 + 4*x*(1+2*x))) + 1./2. * np.log(np.pi)
return np.exp(x*np.log(m) - m - lnx)
data = np.loadtxt('mypoisson.txt', dtype = 'int')
unique, counts = np.unique(data, return_counts = True)
hist = counts.astype(float) / counts.sum()
stat_err = np.sqrt(counts) / counts.sum()
plt.errorbar(unique, hist, yerr = stat_err, fmt = '.', \
label = 'Poisson generated \n by std::poisson_distribution')
plt.plot(unique, expectation(unique, expected_mean), \
label = 'expected probability \n density function')
plt.legend()
plt.show()
# Determine bins with statistical significance of deviation larger than 3 sigma
deviation_in_sigma = (hist - expectation(unique, expected_mean)) / stat_err
d = dict((k, v) for k, v in zip(unique, deviation_in_sigma) if np.abs(v) > 3.0)
print d
该脚本产生以下情节:
肉眼就能看出问题所在。 n = 158 处的偏差具有统计显着性,实际上是 22σ 偏差!
上一个剧情特写
我的系统设置如下(Debian 测试):
libstdc++-7-dev:
Installed: 7.2.0-16
libc++-dev:
Installed: 3.5-2
clang:
Installed: 1:3.8-37
g++:
Installed: 4:7.2.0-1d1
我可以确认使用libstdc++
时的错误:
g++ -o pois_gcc -std=c++11 pois.cpp
clang++ -o pois_clang -std=c++11 -stdlib=libstdc++ pois.cpp
clang++ -o pois_clang_libc -std=c++11 -stdlib=libc++ pois.cpp
结果:
我想我遇到了来自 C++ 标准库的 std::poisson_distribution 的错误行为。
问题:
- 你能确认这确实是一个错误,而不是我的错误吗?
- poisson_distribution函数的标准库代码究竟有什么问题,假设它确实是一个错误?
详情:
以下 C++ 代码(文件 poisson_test.cc)用于生成泊松分布数:
#include <array>
#include <cmath>
#include <iostream>
#include <random>
int main() {
// The problem turned out to be independent on the engine
std::mt19937_64 engine;
// Set fixed seed for easy reproducibility
// The problem turned out to be independent on seed
engine.seed(1);
std::poisson_distribution<int> distribution(157.17);
for (int i = 0; i < 1E8; i++) {
const int number = distribution(engine);
std::cout << number << std::endl;
}
}
我编译这段代码如下:
clang++ -o poisson_test -std=c++11 poisson_test.cc
./poisson_test > mypoisson.txt
以下 python 脚本用于分析文件 mypoisson.txt 中的随机数序列:
import numpy as np
import matplotlib.pyplot as plt
def expectation(x, m):
" Poisson pdf "
# Use Ramanujan formula to get ln n!
lnx = x * np.log(x) - x + 1./6. * np.log(x * (1 + 4*x*(1+2*x))) + 1./2. * np.log(np.pi)
return np.exp(x*np.log(m) - m - lnx)
data = np.loadtxt('mypoisson.txt', dtype = 'int')
unique, counts = np.unique(data, return_counts = True)
hist = counts.astype(float) / counts.sum()
stat_err = np.sqrt(counts) / counts.sum()
plt.errorbar(unique, hist, yerr = stat_err, fmt = '.', \
label = 'Poisson generated \n by std::poisson_distribution')
plt.plot(unique, expectation(unique, expected_mean), \
label = 'expected probability \n density function')
plt.legend()
plt.show()
# Determine bins with statistical significance of deviation larger than 3 sigma
deviation_in_sigma = (hist - expectation(unique, expected_mean)) / stat_err
d = dict((k, v) for k, v in zip(unique, deviation_in_sigma) if np.abs(v) > 3.0)
print d
该脚本产生以下情节:
肉眼就能看出问题所在。 n = 158 处的偏差具有统计显着性,实际上是 22σ 偏差!
上一个剧情特写
我的系统设置如下(Debian 测试):
libstdc++-7-dev:
Installed: 7.2.0-16
libc++-dev:
Installed: 3.5-2
clang:
Installed: 1:3.8-37
g++:
Installed: 4:7.2.0-1d1
我可以确认使用libstdc++
时的错误:
g++ -o pois_gcc -std=c++11 pois.cpp
clang++ -o pois_clang -std=c++11 -stdlib=libstdc++ pois.cpp
clang++ -o pois_clang_libc -std=c++11 -stdlib=libc++ pois.cpp
结果: