LightGBM Error - length not same as data
I am using LightGBM to find feature importances, but I get the error LightGBMError: b'len of label is not same with #data'.
X.shape
(73147, 12)
y.shape
(73147,)
Code:
import numpy as np
from sklearn.model_selection import train_test_split
import lightgbm as lgb

# Initialize an empty array to hold feature importances
feature_importances = np.zeros(X.shape[1])

# Create the model with several hyperparameters
model = lgb.LGBMClassifier(objective='binary', boosting_type='goss', n_estimators=10000, class_weight='balanced')

# Fit the model twice to avoid overfitting
for i in range(2):
    # Split into training and validation set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=i)

    # Train using early stopping
    model.fit(X, y_train, early_stopping_rounds=100, eval_set=[(X_test, y_test)],
              eval_metric='auc', verbose=200)

    # Record the feature importances
    feature_importances += model.feature_importances_
See the screenshot below: [screenshot not available]
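There seems to be a typo in your code; instead of
model.fit(X, y_train, [...])
it should be
model.fit(X_train, y_train, [...])
As written, X is the full dataset (73147 rows) while y_train contains only the 75% training portion of the labels, so their lengths differ, which is exactly what the error reports.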
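To make the mismatch concrete, here is a minimal sketch that reproduces the length difference without training anything. The data is synthetic (random values with the same shape as in the question), and `check_same_length` is a hypothetical helper written for this illustration, not part of LightGBM or scikit-learn.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def check_same_length(features, labels):
    """Hypothetical guard: raise a readable error if row counts differ."""
    n_rows, n_labels = len(features), len(labels)
    if n_rows != n_labels:
        raise ValueError(
            f"features have {n_rows} rows but labels have {n_labels} entries"
        )


# Synthetic stand-in data with the same shape as in the question (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(73147, 12))
y = rng.integers(0, 2, size=73147)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# X keeps every row, while X_train and y_train hold only the training portion.
print(len(X), len(X_train), len(y_train))

# Passes: both arrays come from the same split.
check_same_length(X_train, y_train)

# Fails: this is the (X, y_train) pairing from the question.
try:
    check_same_length(X, y_train)
except ValueError as err:
    print(err)
```

Calling a guard like this right before `model.fit(...)` turns the low-level LightGBM message into one that names both lengths. Passing `X_train` together with `y_train`, as suggested above, makes the check pass.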