Decision tree regression producing multiple lines

I'm trying to do univariate regression with a decision tree regressor, but when I plot the results, the figure shows multiple lines, as in the image below. I didn't run into this problem with linear regression.

https://snipboard.io/v9QaoC.jpg - I can't post images because my reputation is under 10

My code:

import numpy as np
from sklearn.tree import DecisionTreeRegressor
import matplotlib.pyplot as plt

# X_train, y_train, X_test are pandas Series defined earlier (not shown)

# Fit regression model
regr_1 = DecisionTreeRegressor(max_depth=2)
regr_2 = DecisionTreeRegressor(max_depth=5)
regr_1.fit(X_train.values.reshape(-1, 1), y_train.values.reshape(-1, 1))
regr_2.fit(X_train.values.reshape(-1, 1), y_train.values.reshape(-1, 1))

# Predict
y_1 = regr_1.predict(X_test.values.reshape(-1, 1))
y_2 = regr_2.predict(X_test.values.reshape(-1, 1))

# Plot the results
plt.figure()
plt.scatter(X_train, y_train, s=20, edgecolor="black", c="darkorange", label="data")
plt.plot(X_test, y_1, color="cornflowerblue", label="max_depth=2", linewidth=2)
plt.plot(X_test, y_2, color="yellowgreen", label="max_depth=5", linewidth=2)
plt.xlabel("data")
plt.ylabel("target")
plt.title("Decision Tree Regression")
plt.legend()
plt.show()

Your plot looks off because your test samples aren't sorted, so `plt.plot` is 'connecting the dots' between test points in whatever random order they appear. This wasn't visible with your linear regression because all the segments lie on the same straight line and overlap.
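A minimal reproduction with synthetic data (the variable names here are illustrative, not from the question) shows the mechanism: `plt.plot` draws line segments in array order, so an unsorted test set makes the line double back on itself, and sorting restores a single left-to-right trace:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 5, 80)                      # unsorted 1-D feature
y_train = np.sin(X_train) + rng.normal(0, 0.1, 80)
X_test = rng.uniform(0, 5, 40)                       # also unsorted

tree = DecisionTreeRegressor(max_depth=2).fit(X_train.reshape(-1, 1), y_train)
y_pred = tree.predict(X_test.reshape(-1, 1))

# With max_depth=2 the tree has at most 4 leaves, so y_pred takes at most
# 4 distinct values: the true prediction curve is a step function.
# Plotting (X_test, y_pred) as-is zig-zags between those steps.

# Sorting x (and reordering y to match) makes plt.plot trace the step
# function left to right instead of jumping back and forth.
order = np.argsort(X_test)
X_sorted, y_sorted = X_test[order], y_pred[order]
```

The same `order` index can reorder any other array that must stay paired with `X_test`.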

You can get the plot you expect by sorting the test data:

# Sort
X_test = np.sort(X_test.values)  # returns a NumPy array; use axis=0 if X_test has shape (n_samples, 1)

# Predict (X_test is now a NumPy array, so no .values here)
y_1 = regr_1.predict(X_test.reshape(-1, 1))
y_2 = regr_2.predict(X_test.reshape(-1, 1))

# Plot the results
plt.figure()
plt.scatter(X_train, y_train, s=20, edgecolor="black", c="darkorange", label="data")
plt.plot(X_test, y_1, color="cornflowerblue", label="max_depth=2", linewidth=2)
plt.plot(X_test, y_2, color="yellowgreen", label="max_depth=5", linewidth=2)
plt.xlabel("data")
plt.ylabel("target")
plt.title("Decision Tree Regression")
plt.legend()
plt.show()
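One caveat: `np.sort` returns the sorted values alone, so if you also had test targets (say a hypothetical `y_test`, not shown in the question), sorting `X_test` this way would break the pairing between the two arrays. A small sketch using `np.argsort` keeps them aligned:

```python
import numpy as np

# Illustrative values: y_test is paired element-wise with X_test.
X_test = np.array([3.0, 1.0, 2.0])
y_test = np.array([30.0, 10.0, 20.0])

# argsort gives the permutation that sorts X_test; applying it to both
# arrays sorts by x while keeping each (x, y) pair together.
order = np.argsort(X_test)
X_test, y_test = X_test[order], y_test[order]
# X_test is now [1., 2., 3.] and y_test follows as [10., 20., 30.]
```

This is the safer habit whenever several arrays share the same sample order.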