Generate anisotropic data in sklearn

The sklearn documentation provides code that transforms normally distributed blobs into anisotropically distributed data, as shown below:

transformation = [[0.60834549, -0.63667341], [-0.40887718, 0.85253229]]
X_aniso = np.dot(X, transformation)

Link to the code here.

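I would like to know what function the entries of the transformation matrix correspond to. Or, more generally, how can an isotropic Gaussian blob be transformed into an anisotropic one?

Can anyone help?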

The function is a linear transformation. You can get the concrete angle and scale of the operations using the formulae described here.
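As a rough sketch of what the matrix entries do, and of one way to read off angles and scales: the snippet below uses a singular value decomposition, which is my own choice of technique and not necessarily the formulae linked above. The helper name `angle_deg` and the sample point `x` are made up for illustration.

```python
import numpy as np

# Transformation matrix quoted in the question.
transformation = np.array([[0.60834549, -0.63667341],
                           [-0.40887718, 0.85253229]])

# Samples are rows, so each new coordinate is a weighted sum of the old ones:
# column j of the matrix holds the weights for new coordinate j.
x = np.array([1.0, 2.0])  # arbitrary sample point, for illustration only
manual = np.array([
    x[0] * transformation[0, 0] + x[1] * transformation[1, 0],
    x[0] * transformation[0, 1] + x[1] * transformation[1, 1],
])
print(manual, x @ transformation)  # the two results agree

# SVD: transformation = U @ diag(s) @ Vt. U and Vt are orthogonal
# (rotations, possibly combined with a reflection), s holds the scales.
U, s, Vt = np.linalg.svd(transformation)


def angle_deg(Q):
    """Angle in degrees of the first column of a 2x2 orthogonal matrix.

    If det(Q) is -1 the matrix also contains a reflection, so the angle
    alone does not describe it completely.
    """
    return np.degrees(np.arctan2(Q[1, 0], Q[0, 0]))


print("scales:", s)
print("angle of U:", angle_deg(U), "angle of Vt:", angle_deg(Vt))

# For data with identity covariance, X @ transformation has covariance
# transformation.T @ transformation; unequal eigenvalues mean anisotropy.
cov = transformation.T @ transformation
print("implied covariance:\n", cov)
```

The two singular values are the stretch factors along the principal axes of the transformed blob. When they differ, the blob is no longer circular.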

If you want to make a blob anisotropic, you need to shear it along one dimension, which turns it into a kind of ellipsoid.

For example, in 2D:

from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
import numpy as np

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(10, 5))

n_samples = 1500
random_state = 170
X, y = make_blobs(n_samples=n_samples,
                  random_state=random_state, center_box=(0, 20))
ax1.scatter(X[:, 0], X[:, 1], c=y)
ax1.set_title('default')


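# Shear along x: each point's x coordinate is shifted by tan(theta) times its y.
theta = np.radians(60)
t = np.tan(theta)
shear_x = np.array(((1, t), (0, 1))).T

# Samples are stored as rows, so the matrix is applied on the right.
X_sheared = X.dot(shear_x)
ax2.scatter(X_sheared[:, 0], X_sheared[:, 1], c=y)
ax2.set_title('%.0f degrees X shearing' % np.degrees(theta))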


# Shear along y: each point's y coordinate is shifted by tan(theta) times its x.
theta = np.radians(70)
t = np.tan(theta)
shear_y = np.array(((1, 0), (t, 1))).T

X_sheared = X.dot(shear_y)
ax3.scatter(X_sheared[:, 0], X_sheared[:, 1], c=y)
ax3.set_title('%.0f degrees Y shearing' % np.degrees(theta))
plt.tight_layout()
plt.show()
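Shearing is one option. For the more general question of turning an isotropic Gaussian blob into one with a chosen shape, a common approach is to multiply by a Cholesky factor of the desired covariance. The following is only a minimal sketch: `target_cov` is a hypothetical covariance picked for illustration, and the data are drawn with NumPy directly, not with `make_blobs`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Isotropic blob: standard normal samples, covariance close to the identity.
X_iso = rng.standard_normal((5000, 2))

# Hypothetical target covariance, chosen only for illustration.
target_cov = np.array([[2.0, 1.2],
                       [1.2, 1.0]])

# The Cholesky factor L is lower triangular and satisfies L @ L.T == target_cov.
L = np.linalg.cholesky(target_cov)

# Samples are rows, so right-multiplying by L.T yields covariance
# L @ I @ L.T == target_cov, up to sampling noise.
X_aniso = X_iso @ L.T

sample_cov = np.cov(X_aniso, rowvar=False)

# Ratio of largest to smallest covariance eigenvalue: 1 would mean isotropic.
eigvals = np.linalg.eigvalsh(sample_cov)
anisotropy = eigvals[-1] / eigvals[0]

print("sample covariance:\n", sample_cov)
print("anisotropy ratio:", anisotropy)
```

Any matrix `A` with `A.T @ A` equal to the target covariance would work in place of `L.T`. The Cholesky factor is just a convenient one to compute.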