LightGBMError: Do not support special JSON characters in feature name - The same code works in Jupyter but not in Spyder
I have the following code:
most_important = features_importance_chi(importance_score_tresh,
                                         df_user.drop(columns='CHURN'), churn)
X = df_user.drop(columns='CHURN')
churn[churn == 2] = 1
y = churn
# handle undersample problem
X, y = handle_undersampe(X, y)
# train the model
X = X.loc[:, X.columns.isin(most_important)].values
y = y.values
parameters = {
    'application': 'binary',
    'objective': 'binary',
    'metric': 'auc',
    'is_unbalance': 'true',
    'boosting': 'gbdt',
    'num_leaves': 31,
    'feature_fraction': 0.5,
    'bagging_fraction': 0.5,
    'bagging_freq': 20,
    'learning_rate': 0.05,
    'verbose': 0
}
# split data
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
train_data = lightgbm.Dataset(x_train, label=y_train)
test_data = lightgbm.Dataset(x_test, label=y_test)
model = lightgbm.train(parameters,
                       train_data,
                       valid_sets=[train_data, test_data],
                       feature_name=most_important,
                       num_boost_round=5000,
                       early_stopping_rounds=100)
And this is the function that returns the most_important parameter:
def features_importance_chi(importance_score_tresh, X, Y):
    model = ExtraTreesClassifier(n_estimators=10)
    model.fit(X, Y.values.ravel())
    feature_list = pd.Series(model.feature_importances_,
                             index=X.columns)
    feature_list = feature_list[feature_list > importance_score_tresh]
    feature_list = feature_list.index.values.tolist()
    return feature_list
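Interestingly, this code raises the following error in Spyder:
LightGBMError: Do not support special JSON characters in feature name.
But it works fine in Jupyter, where I can print the list of most important features.
Any idea what causes this error?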
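This message comes up often with LGBMClassifier() models, i.e. LightGBM.
After loading your data with pandas, put this line at the top and the problem goes away. It strips every character other than letters, digits, and underscores from the column names: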
import re
df = df.rename(columns = lambda x:re.sub('[^A-Za-z0-9_]+', '', x))
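One caveat with the rename above: stripping characters can make two different column names collapse into the same string. Below is a minimal sketch of a variant that first reports which names look problematic and then cleans them while keeping them unique. The helper names `find_offending_names` and `sanitize_feature_names` are hypothetical, and the set of rejected characters is an assumption rather than something stated in the post.

```python
import re

# Characters assumed to trigger the error (an assumption, not taken from the post).
SPECIAL_JSON_CHARS = set('",:[]{}')


def find_offending_names(names):
    """Return the names that contain at least one assumed special character."""
    return [n for n in names if SPECIAL_JSON_CHARS & set(str(n))]


def sanitize_feature_names(names):
    """Keep only [A-Za-z0-9_] in each name and make the results unique."""
    used = set()
    cleaned = []
    for name in names:
        # Same substitution as the one-liner above; fall back to a
        # placeholder if nothing is left after stripping.
        base = re.sub(r"[^A-Za-z0-9_]+", "", str(name)) or "feature"
        candidate = base
        suffix = 1
        # Append a numeric suffix until the name no longer collides.
        while candidate in used:
            candidate = f"{base}_{suffix}"
            suffix += 1
        used.add(candidate)
        cleaned.append(candidate)
    return cleaned


example = ["age (years)", "age[years]", "plan:type", "ok_1"]
print(find_offending_names(example))
print(sanitize_feature_names(example))
```

With a DataFrame this could be applied as `df.columns = sanitize_feature_names(df.columns)` before the feature list is computed. If Spyder and Jupyter run in different Python environments, it may also be worth comparing `lightgbm.__version__` in both, since that could explain why only one of them raises the error.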