How can I create linearly related random data in Python?

Question

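I'm trying to write a blog post about linear regression, but I'm stuck on creating a random dataset that is linearly related.

The code below gives me data that is roughly linearly related, but how can I make the spread wider?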

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(3, 1, 100)
y = 0.77 * (x + np.random.normal(0, 0.1, 100)) + 0.66

plt.figure(figsize=(20, 5))
plt.scatter(x, y)
plt.show()

Answer 1

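If you want test data, then by definition you don't want it to be purely random. You do want it to be noisy, though. A good way to get that is to pick points that lie exactly on your line and then shift each one slightly so they become noisy.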

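import random
import numpy as np

# 50 evenly spaced points lying exactly on the line y = 0.5 * x
x = np.linspace(0, 6)
y = np.linspace(0, 3)

noise_factor = 0.2

def noise(k):
    # Shift k by a uniform random offset in [-noise_factor, noise_factor)
    return k + ((random.random() * 2) - 1) * noise_factor

# Apply the jitter to every coordinate
x = np.vectorize(noise)(x)
y = np.vectorize(noise)(y)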
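
To widen the spread in the question's data, one option is to add the noise to y directly and raise its standard deviation. In the question's code the noise sits inside the parentheses, so it is multiplied by 0.77 and ends up with a standard deviation of only about 0.077 around the line. Below is a minimal sketch of that idea; make_linear_data, noise_std and seed are names made up for illustration and are not from the original post.

```python
import numpy as np

def make_linear_data(n=100, slope=0.77, intercept=0.66, noise_std=0.5, seed=0):
    """Return x and y = slope * x + intercept + Gaussian noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(3, 1, n)              # same x distribution as in the question
    noise = rng.normal(0, noise_std, n)  # vertical scatter around the line
    y = slope * x + intercept + noise
    return x, y

# A small noise_std hugs the line; a larger one spreads the points out.
x_tight, y_tight = make_linear_data(noise_std=0.1)
x_wide, y_wide = make_linear_data(noise_std=1.0)

# Standard deviation of the residuals around the true line
tight_spread = np.std(y_tight - (0.77 * x_tight + 0.66))
wide_spread = np.std(y_wide - (0.77 * x_wide + 0.66))
print(tight_spread, wide_spread)
```

The same knob exists in the answer's code above: increasing noise_factor there should widen the scatter in the same way, only with uniform rather than Gaussian noise.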

python

random

linear-regression