TPOT error in Python: cannot set using a slice indexer with a different length
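I am trying to run TPOT to optimize the hyperparameters of a random forest using a genetic algorithm. I get an error message and am not sure how to fix it. Below is the basic code I am using.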
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier
X = my_df_features
y = my_df_target
X_train, X_test, y_train, y_test = train_test_split(X,y, random_state=42)
model_parameters = {'n_estimators': [100, 200],
                    'max_depth': [None, 5, 10],
                    'max_features': [10]}
# This seems to work perfectly fine when I run it
# model_tuned = GridSearchCV(RandomForestClassifier(),model_parameters, cv=5)
# This does not seem to work
model_tuned = TPOTClassifier(generations=2, population_size=2, offspring_size=2,
                             verbosity=2, early_stop=10,
                             config_dict={'sklearn.ensemble.RandomForestClassifier': model_parameters},
                             cv=5)
model_tuned.fit(X_train,y_train)
When using TPOT (instead of RandomForest), the last line above produces the following error:
ValueError: cannot set using a slice indexer with a different length than the value
I tried running TPOT on the iris dataset and did not get any error:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X,y, random_state=42)
model_parameters = {'n_estimators': [100, 200],
                    'max_depth': [None, 5, 10],
                    'max_features': [len(X_train[0])]}
model_tuned = TPOTClassifier(generations=2,
                             population_size=2,
                             offspring_size=2,
                             verbosity=2,
                             early_stop=10,
                             config_dict={'sklearn.ensemble.RandomForestClassifier': model_parameters},
                             cv=5)
model_tuned.fit(X_train, y_train)
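I think there is a problem with the shape or type of your dataset.
It may be because you are passing pandas DataFrames, whereas the iris example that works uses NumPy arrays.
Try converting both to NumPy arrays before splitting: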
X = X.to_numpy()
y = y.to_numpy()
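To check whether the conversion did what you expect, it can help to look at the types and shapes before calling fit. The snippet below is only a sketch: the helper name as_numpy_xy and the small stand-in DataFrames are made up for illustration and are not part of TPOT or of the question's data. It assumes the target may be a one-column DataFrame and flattens it to a 1-D array, which is the shape scikit-learn style estimators usually expect.

```python
import numpy as np
import pandas as pd


def as_numpy_xy(X, y):
    """Return (X, y) as NumPy arrays, with y flattened to 1-D.

    Hypothetical helper: uses .to_numpy() when the object provides it
    (pandas DataFrame/Series) and falls back to np.asarray otherwise.
    """
    X_arr = X.to_numpy() if hasattr(X, "to_numpy") else np.asarray(X)
    y_arr = y.to_numpy() if hasattr(y, "to_numpy") else np.asarray(y)
    return X_arr, y_arr.ravel()


# Stand-in data playing the role of my_df_features / my_df_target.
features = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                         "b": [0.5, 0.1, 0.9, 0.3]})
target = pd.DataFrame({"label": [0, 1, 0, 1]})

X, y = as_numpy_xy(features, target)

# Inspect what will actually be handed to train_test_split / fit.
print(type(X).__name__, X.shape)
print(type(y).__name__, y.shape)
```

If X comes out two-dimensional and y one-dimensional with the same number of rows, the arrays can be passed to train_test_split and then to model_tuned.fit as in the question. Whether this removes the ValueError depends on your data, so treat it as a first thing to rule out rather than a guaranteed fix.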