pyweka warning on a model
My question is why I am getting this warning:
java.beans.IntrospectionException: Method not found: isNumToSelect
java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:110)
java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:74)
weka.core.PropertyPath.find(PropertyPath.java:386)
weka.core.SetupGenerator.setup(SetupGenerator.java:499)
weka.classifiers.meta.multisearch.DefaultEvaluationTask.doRun(DefaultEvaluationTask.java:83)
weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:113)
weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:34)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:829)
at java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:110)
at java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:74)
at weka.core.PropertyPath.find(PropertyPath.java:386)
at weka.core.SetupGenerator.setup(SetupGenerator.java:499)
at weka.classifiers.meta.multisearch.DefaultEvaluationTask.doRun(DefaultEvaluationTask.java:83)
at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:113)
at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:34)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
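I don't know how to resolve it. The model runs correctly, but this warning is printed many times over a long period before the result comes back.
Edit:
Here is my code: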
# Imports assume python-weka-wrapper3 (they were not part of the original snippet)
from weka.attribute_selection import ASEvaluation, ASSearch
from weka.classifiers import (AttributeSelectedClassifier, Classifier, FilteredClassifier,
                              MultiSearch, SingleClassifierEnhancer)
from weka.core.classes import MathParameter, from_commandline, get_classname
from weka.filters import Filter

base_model_3 = Classifier(classname="weka.classifiers.trees.ADTree",
                          options=["-B", "10", "-E", "-3", "-S", "1"])
CostS_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.CostSensitiveClassifier",
                                             options=["-cost-matrix", "[0.0 2.0; 1.0 0.0]", "-S", "1"])
CostS_cls_model_3.classifier = base_model_3
ROS = Filter(classname="weka.filters.supervised.instance.Resample", options=["-B", "1", "-Z", "165"])
fc_model_3_ROS = FilteredClassifier(options=["-S", "1"])
fc_model_3_ROS.filter = ROS
fc_model_3_ROS.classifier = CostS_cls_model_3
bagging_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.Bagging",
                                               options=["-P", "100", "-S", "1", "-num-slots", "1", "-I", "100"])
bagging_cls_model_3.classifier = fc_model_3_ROS
AttS_cls_model_3 = AttributeSelectedClassifier()
AttS_cls_model_3.search = from_commandline('weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N 61', classname=get_classname(ASSearch))
AttS_cls_model_3.evaluator = from_commandline('weka.attributeSelection.InfoGainAttributeEval', classname=get_classname(ASEvaluation))
AttS_cls_model_3.classifier = bagging_cls_model_3
multisearch_cls_model_3 = MultiSearch(options=["-S", "1", "-class-label", "1"])
multisearch_cls_model_3.evaluation = "FM"
multisearch_cls_model_3.search = ["-sample-size", "100", "-initial-folds", "2", "-subsequent-folds", "10",
                                  "-initial-test-set", ".", "-subsequent-test-set", ".", "-num-slots", "1"]
mparam_model_3 = MathParameter()
mparam_model_3.prop = "numToSelect"
mparam_model_3.minimum = 5.0
mparam_model_3.maximum = 134.0
mparam_model_3.step = 1.0
mparam_model_3.base = 10.0
mparam_model_3.expression = "I"
multisearch_cls_model_3.parameters = [mparam_model_3]
multisearch_cls_model_3.classifier = AttS_cls_model_3
MissingValues = Filter(classname="weka.filters.unsupervised.attribute.ReplaceMissingValues")
fc_model_3_MV = FilteredClassifier(options=["-S", "1"])
fc_model_3_MV.filter = MissingValues
fc_model_3_MV.classifier = multisearch_cls_model_3
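Maybe I can't use "numToSelect"? Is there a list of the properties that MultiSearch supports?
I have another question, about sklearn-weka-plugin: is there any way to use RandomizedSearchCV or GridSearch (from sklearn) to tune a combination of parameters on a Bagging model that uses ADTree as its base estimator?
Something like this: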
Base_CostS = WekaEstimator(classifier=base_model_1, classname="weka.classifiers.meta.CostSensitiveClassifier",
                           options=["-cost-matrix", "[0.0 1.0; 1.0 0.0]", "-S", "1", "-W", "weka.classifiers.trees.ADTree"],
                           nominal_input_vars=[2, 3, 4],  # which attributes need to be treated as nominal
                           nominal_output_var=True)  # class is nominal as well
bagging_model = BaggingClassifier(base_estimator=Base_CostS, n_estimators=100, n_jobs=None, random_state=1)
param_distributions_BG = {
    'n_estimators': [10, 50, 75, 100],
    'max_samples': [0.2, 0.5, 1.0],
    'bootstrap': [True, False],
    'base__iterations': [10, 15, 20],
    'base__Expand_Nodes': ["-3", "-2", "-1", "1"]
}
# Cross-validated search
# ==============================================================================
grid_r = RandomizedSearchCV(
    estimator=bagging_model,
    param_distributions=param_distributions_BG,
    n_iter=50,
    scoring={'Precision': 'precision_macro',
             'Recall': 'recall_macro',
             'F1_Score': 'f1_macro'},
    cv=RepeatedKFold(n_splits=5, n_repeats=5),
    verbose=0,
    random_state=1,
    return_train_score=True,
    refit=refit_aux
)
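I don't know whether something like that is possible or whether I have to do something different. I would also like to look at "feature_importances_", but I don't think the bagging model has that. The purpose of "feature_importances_" is to analyze it with SHAP.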
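MultiSearch uses the property path that you define (the .prop property of the parameter object) to locate, within the nested Java objects, the object that the parameter should be applied to. It neither needs nor has a predefined list of properties that it can optimize. Depending on how you nest classifiers, filters, and attribute selection, you have to adjust this path.
In your setup, you have the following nesting: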
MultiSearch
|
+- AttributeSelectedClassifier
   |
   +- Ranker
   |
   +- InfoGainAttributeEval
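Any property path is applied to the classifier that you specify for MultiSearch. If you use numToSelect, MultiSearch looks for this Java property in your AttributeSelectedClassifier. Since it is a property of the Ranker object, it cannot be found there. The Ranker object itself is accessible through the search property of the AttributeSelectedClassifier. In other words, you need to use search.numToSelect as your property path.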
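In the code from the question, that means changing the one line that sets .prop to "search.numToSelect". To make the lookup rule concrete, here is a minimal pure-Python sketch of how a dotted property path can be resolved through nested objects. It is only a toy model, not Weka's implementation: the three classes are stand-ins named after the Weka classes, and resolve and set_by_path are hypothetical helpers invented for this illustration.

```python
# Toy model of dotted property-path lookup (NOT Weka's actual code).
# The classes below are stand-ins that only mirror the nesting in the question.

class Ranker:
    def __init__(self):
        self.numToSelect = -1  # the property we want to optimize


class InfoGainAttributeEval:
    pass


class AttributeSelectedClassifier:
    def __init__(self):
        self.search = Ranker()                    # reachable via "search"
        self.evaluator = InfoGainAttributeEval()  # reachable via "evaluator"


def resolve(root, path):
    """Walk a dotted path from root; return (owner object, final property name)."""
    *parents, leaf = path.split(".")
    obj = root
    for name in parents:
        obj = getattr(obj, name)  # raises AttributeError if a step is missing
    if not hasattr(obj, leaf):
        raise AttributeError(f"{type(obj).__name__} has no property {leaf!r}")
    return obj, leaf


def set_by_path(root, path, value):
    """Set the property addressed by the dotted path, starting at root."""
    owner, leaf = resolve(root, path)
    setattr(owner, leaf, value)


asc = AttributeSelectedClassifier()

# Works: the path starts at the classifier and steps into its "search" object.
set_by_path(asc, "search.numToSelect", 61)

# Fails: the classifier itself has no "numToSelect" property.
try:
    set_by_path(asc, "numToSelect", 61)
except AttributeError as exc:
    print("lookup failed:", exc)
```

The failing call corresponds to the situation in the question: the path is resolved against the object handed to MultiSearch, so a property that lives one level deeper has to be addressed through the property that leads to it.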