Sample size for Seaborn regplot y_test and predictions

Question

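I am fairly new to Python. I am trying to plot y_test against the predictions of my regression model in a Seaborn regplot, but it results in overplotting. I tried sampling from my df (credit), but the sampling has no effect. Here is my code: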

# modeling
algo = XGBRegressor(n_estimators=50, max_depth=5)
model = algo.fit(X_train, y_train)

# predictions
preds = model.predict(X_test)

# sampling
data_sample = credit.sample(100)

# plotting results
sns.set_style('ticks')
sns.regplot(y_test, preds, data=data_sample, fit_reg=True, scatter_kws={'color': 'darkred', 'alpha': 0.3, 's': 100})

Any ideas on how to take a sample of y_test and the predictions? TY

Answer 1

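You are passing the objects y_test and preds directly, so sns.regplot never draws them from data_sample: the data argument only matters when x and y are given as column names of that data frame. You need to subset a data frame that contains both variables, for example: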

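import xgboost as xgb
# Note: load_boston was removed in scikit-learn 1.2; on newer versions
# substitute another regression dataset (e.g. fetch_california_housing).
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd  # needed for pd.DataFrame further down

boston = load_boston()
X_train, X_test, y_train, y_test = train_test_split(boston.data, boston.target, test_size=0.15)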

# modeling
algo = xgb.XGBRegressor(n_estimators=50, max_depth=5)
model = algo.fit(X_train, y_train)

I create a data frame that holds the test values and the predictions side by side:

preds = model.predict(X_test)
plotDa = pd.DataFrame({'y_test': y_test, 'preds': preds})

sns.set_style('ticks')
sns.regplot(x='y_test',y='preds', data=plotDa.sample(10), fit_reg=True, scatter_kws={'color': 'darkred', 'alpha': 0.3, 's': 100})

Alternatively, you can draw an index and use it to subset both arrays before plotting:

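# replace=False so that no point is drawn twice
subsample = np.random.choice(len(preds), 10, replace=False)
# recent seaborn versions require x and y to be passed as keywords
sns.regplot(x=y_test[subsample], y=preds[subsample], fit_reg=True)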
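
The index approach can be wrapped in a small helper. This is only a sketch: the function name paired_subsample and the synthetic arrays are made up for illustration and are not part of the original code. It converts both inputs to NumPy arrays first, because if y_test is a pandas Series (as it probably is when the split comes from a data frame), y_test[subsample] would look up labels rather than positions.

```python
import numpy as np


def paired_subsample(y_true, y_pred, n, seed=None):
    """Return n (y_true, y_pred) pairs drawn without replacement.

    Hypothetical helper: the same positional index is applied to both
    arrays, so each true value stays paired with its own prediction.
    """
    y_true = np.asarray(y_true)  # positional indexing even for a Series
    y_pred = np.asarray(y_pred)
    if y_true.shape[0] != y_pred.shape[0]:
        raise ValueError("y_true and y_pred must have the same length")
    rng = np.random.default_rng(seed)
    n = min(n, y_true.shape[0])  # never ask for more points than exist
    idx = rng.choice(y_true.shape[0], size=n, replace=False)
    return y_true[idx], y_pred[idx]


# Synthetic stand-ins for y_test and preds (not real model output)
rng = np.random.default_rng(0)
y_test_demo = rng.normal(loc=20.0, scale=5.0, size=500)
preds_demo = y_test_demo + rng.normal(scale=2.0, size=500)

y_sub, p_sub = paired_subsample(y_test_demo, preds_demo, n=100, seed=1)
```

The two returned arrays can then be passed to sns.regplot(x=y_sub, y=p_sub, fit_reg=True). Fixing seed makes the same 100 points appear on every run, which helps when comparing plots.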
