3D plot of the error function in a linear regression

Question

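I would like to plot a 3D graph of the error function computed for a given slope and y-intercept in a linear regression. The graph will be used to illustrate an application of gradient descent.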

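Suppose we want to model a set of points with a line. To do this we use the standard line equation y=mx+b, where m is the slope of the line and b is its y-intercept. To find the best line for our data, we need to find the best pair of slope m and y-intercept b values.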

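A standard approach to this type of problem is to define an error function (also called a cost function) that measures how "good" a given line is. The function takes an (m,b) pair and returns an error value based on how well the line fits the data. To compute the error for a given line, we iterate over each (x,y) point in the data set, sum the squared distances between each point's y value and the candidate line's y value (computed as mx+b), and divide by the number of points. It is conventional to square this distance to ensure that it is positive and to make the error function differentiable. In Python, computing the error for a given line looks like this: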

# y = mx + b
# m is slope, b is y-intercept
def computeErrorForLineGivenPoints(b, m, points):
    totalError = 0
    for i in range(0, len(points)):
        totalError += (points[i].y - (m * points[i].x + b)) ** 2
    return totalError / float(len(points))
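
The same computation can also be sketched with NumPy arrays instead of an explicit Python loop. This is only an illustration under stated assumptions: the name mean_squared_error, the separate xs and ys arrays, and the sample data are hypothetical and not part of the original question.

```python
import numpy as np

def mean_squared_error(m, b, xs, ys):
    """Mean squared vertical distance between the points and the line y = m*x + b."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    residuals = ys - (m * xs + b)   # one residual per data point
    return np.mean(residuals ** 2)  # average of the squared residuals

# Hypothetical data lying exactly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(mean_squared_error(2.0, 1.0, xs, ys))  # exact fit, so the error is 0.0
print(mean_squared_error(2.0, 0.0, xs, ys))  # every residual is 1, so the error is 1.0
```

Keeping x and y in separate arrays avoids per-point attribute access and lets NumPy do the arithmetic in one pass.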

Since the error function depends on two parameters (m and b), we can visualize it as a two-dimensional surface.
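
For reference, the quantity computed by the function above can be written out as follows. The symbols E, N, x_i and y_i are notation assumed here for illustration; they do not appear in the original post.

```latex
E(m, b) = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)^2
```

Because E is quadratic in both m and b, the surface plotted over the (m, b) plane is bowl-shaped, which is what makes it a convenient picture for gradient descent.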

Now my question: how can we plot such a 3D graph using Python?

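Below is skeleton code for building a 3D plot. The snippet is unrelated to the problem above, but it shows the basics of building a 3D plot. For my example, I need the x-axis to be the slope, the y-axis to be the y-intercept, and the z-axis to be the error.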

Can someone help me build an example of such a graph?

import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

def fun(x, y):
  return x**2 + y

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = y = np.arange(-3.0, 3.0, 0.05)
X, Y = np.meshgrid(x, y)
zs = np.array([fun(x,y) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)

ax.plot_surface(X, Y, Z)

ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')

plt.show()

The code above produces a plot that is very similar to what I am looking for.

Answer 1

Simply replace fun with computeErrorForLineGivenPoints:

import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import collections

def error(m, b, points):
    totalError = 0
    for i in range(0, len(points)):
        totalError += (points[i].y - (m * points[i].x + b)) ** 2
    return totalError / float(len(points))

x = y = np.arange(-3.0, 3.0, 0.05)
Point = collections.namedtuple('Point', ['x', 'y'])

m, b = 3, 2
noise = np.random.random(x.size)
points = [Point(xp, m*xp+b+err) for xp,err in zip(x, noise)]

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

ms = np.linspace(2.0, 4.0, 10)
bs = np.linspace(1.5, 2.5, 10)

M, B = np.meshgrid(ms, bs)
zs = np.array([error(mp, bp, points) 
               for mp, bp in zip(np.ravel(M), np.ravel(B))])
Z = zs.reshape(M.shape)

ax.plot_surface(M, B, Z, rstride=1, cstride=1, color='b', alpha=0.5)

ax.set_xlabel('m')
ax.set_ylabel('b')
ax.set_zlabel('error')

plt.show()

This yields a surface plot of the error over m and b.
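
As a possible variation, the grid of error values can be computed with NumPy broadcasting rather than a list comprehension over every (m, b) pair. This is a sketch under stated assumptions: the helper name error_surface, the separate xs and ys arrays, and the seeded random generator are hypothetical choices made for illustration.

```python
import numpy as np

def error_surface(ms, bs, xs, ys):
    """Return (M, B, Z), where Z[i, j] is the mean squared error of the
    line with slope M[i, j] and y-intercept B[i, j]."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    M, B = np.meshgrid(ms, bs)
    # A trailing axis lets every (m, b) pair broadcast against all points,
    # giving residuals of shape (len(bs), len(ms), len(xs)).
    residuals = ys - (M[..., np.newaxis] * xs + B[..., np.newaxis])
    Z = np.mean(residuals ** 2, axis=-1)
    return M, B, Z

# Noisy points around y = 3x + 2, mirroring the data generated above
rng = np.random.default_rng(0)
xs = np.arange(-3.0, 3.0, 0.05)
ys = 3 * xs + 2 + rng.random(xs.size)

M, B, Z = error_surface(np.linspace(2.0, 4.0, 10),
                        np.linspace(1.5, 2.5, 10), xs, ys)
print(Z.shape)  # (10, 10)
```

The returned arrays can be passed straight to ax.plot_surface(M, B, Z) in place of the values built with the loop.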

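Tip: I renamed computeErrorForLineGivenPoints to error. In general there is no need to name a function compute..., since almost all functions compute something. You also do not need to specify "GivenPoints", since the function signature already shows that points is an argument. If your program has other error functions or variables, line_error or total_error might be a better name for this function.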

Tags: python, plot, machine-learning, matplotlib, pandas