MATLAB ksdensity equivalent in Python

Question

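I've looked online but haven't found an answer or a way to do the following.

I'm translating some MATLAB code to Python. In MATLAB I compute a kernel density estimate with this function: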

[p,x] = ksdensity(data)

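where p is the estimated probability density at the points x.

SciPy has a function for this, but it only returns p.

Is there a way to get the density at given values of x?

Thanks!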

Answer 1

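Another option is the kernel density estimator in the Scikit-Learn Python package, sklearn.neighbors.KernelDensity.

Here is a small example, similar to the one in the MATLAB documentation for ksdensity, using a Gaussian kernel: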

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KernelDensity

np.random.seed(12345)
# similar to MATLAB ksdensity example x = [randn(30,1); 5+randn(30,1)];
Vecvalues=np.concatenate((np.random.normal(0,1,30), np.random.normal(5,1,30)))[:,None]
Vecpoints=np.linspace(-8,12,100)[:,None]
kde = KernelDensity(kernel='gaussian', bandwidth=0.5).fit(Vecvalues)
logkde = kde.score_samples(Vecpoints)
plt.plot(Vecpoints,np.exp(logkde))
plt.show()

This plots the estimated density np.exp(logkde) against Vecpoints.

Answer 2

That form of the ksdensity call generates x automatically. scipy.stats.gaussian_kde() returns a callable object that you can evaluate at any x you choose. A roughly equivalent x would be np.linspace(data.min(), data.max(), 100).

import numpy as np
from scipy import stats

data = ...
kde = stats.gaussian_kde(data)
x = np.linspace(data.min(), data.max(), 100)
p = kde(x)
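
To mirror the MATLAB call signature, the snippet above can be wrapped in a small helper. This is only a sketch: the function name ksdensity and the parameter n_points are hypothetical choices for illustration, the sample data is made up, and the grid and default bandwidth are not guaranteed to match MATLAB's output exactly.

```python
import numpy as np
from scipy import stats


def ksdensity(data, n_points=100):
    """Return (p, x): KDE values p evaluated on an evenly spaced grid x.

    Hypothetical helper that imitates the shape of MATLAB's
    [p, x] = ksdensity(data); it is not part of SciPy.
    """
    data = np.asarray(data, dtype=float).ravel()
    kde = stats.gaussian_kde(data)  # Gaussian kernel, Scott's rule by default
    x = np.linspace(data.min(), data.max(), n_points)
    p = kde(x)
    return p, x


# Illustrative data: two Gaussian clusters, as in the MATLAB documentation example
rng = np.random.default_rng(12345)
data = np.concatenate((rng.normal(0, 1, 30), rng.normal(5, 1, 30)))

p, x = ksdensity(data)
print(p.shape, x.shape)
```

Because gaussian_kde returns a callable, the same kde object can also be evaluated at any other x, for example kde(np.array([0.0, 2.5, 5.0])).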

Answer 3

Matlab is orders of magnitude faster than KernelDensity when it comes to finding the optimal bandwidth. Any idea of how to make the KernelDensity faster? – Yuca Jul 16 '18 at 20:58

Hi Yuca. MATLAB uses Scott's rule to estimate the bandwidth, which is fast but assumes the data are roughly normally distributed. For more information, please see this post.
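
As a rough illustration of why a rule-of-thumb bandwidth is cheap, the sketch below computes Scott's rule for 1-D data in closed form and checks it against scipy.stats.gaussian_kde, whose default bw_method is 'scott'. The helper gaussian_kde_manual and the sample data are made up for illustration; whether this matches MATLAB's default bandwidth exactly is not verified here.

```python
import numpy as np
from scipy import stats

# Illustrative sample (assumption: roughly normal data)
rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, 200)
n = data.size

# Scott's rule for 1-D data: h = n^(-1/5) * sample standard deviation
h = n ** (-1.0 / 5.0) * data.std(ddof=1)


def gaussian_kde_manual(x, samples, bandwidth):
    """Average of Gaussian kernels of width `bandwidth` centred on each sample."""
    z = (x[:, None] - samples[None, :]) / bandwidth
    norm = samples.size * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


x = np.linspace(data.min(), data.max(), 100)
p_manual = gaussian_kde_manual(x, data, h)
p_scipy = stats.gaussian_kde(data)(x)  # bw_method defaults to 'scott'

print(np.allclose(p_manual, p_scipy))
```

The same h can be passed as bandwidth=h to sklearn.neighbors.KernelDensity with kernel='gaussian', which avoids a cross-validated bandwidth search when a rule-of-thumb estimate is good enough.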
