pyweka warning on a model

My question is why I am getting this warning:

java.beans.IntrospectionException: Method not found: isNumToSelect
    java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:110)
    java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:74)
    weka.core.PropertyPath.find(PropertyPath.java:386)
    weka.core.SetupGenerator.setup(SetupGenerator.java:499)
    weka.classifiers.meta.multisearch.DefaultEvaluationTask.doRun(DefaultEvaluationTask.java:83)
    weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:113)
    weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:34)
    java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    java.base/java.lang.Thread.run(Thread.java:829)

    at java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:110)
    at java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:74)
    at weka.core.PropertyPath.find(PropertyPath.java:386)
    at weka.core.SetupGenerator.setup(SetupGenerator.java:499)
    at weka.classifiers.meta.multisearch.DefaultEvaluationTask.doRun(DefaultEvaluationTask.java:83)
    at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:113)
    at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:34)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

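I don't know how to fix it. The model runs correctly, but this warning is printed many times over a long period before the result comes back.

Edit:

This is my code: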

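# Imports needed by the code below (python-weka-wrapper3 module layout)
from weka.classifiers import (Classifier, SingleClassifierEnhancer, FilteredClassifier,
                              AttributeSelectedClassifier, MultiSearch)
from weka.filters import Filter
from weka.attribute_selection import ASSearch, ASEvaluation
from weka.core.classes import from_commandline, get_classname, MathParameter

base_model_3 = Classifier(classname="weka.classifiers.trees.ADTree", 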
                  options=["-B", "10", "-E", "-3", "-S", "1"])


CostS_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.CostSensitiveClassifier", 
                                options =["-cost-matrix", "[0.0 2.0; 1.0 0.0]", "-S", "1"])
CostS_cls_model_3.classifier = base_model_3


ROS = Filter(classname="weka.filters.supervised.instance.Resample", options = ["-B","1","-Z","165"])
fc_model_3_ROS = FilteredClassifier(options=["-S","1"])
fc_model_3_ROS.filter = ROS
fc_model_3_ROS.classifier = CostS_cls_model_3


bagging_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.Bagging",
                         options=["-P", "100", "-S", "1", "-num-slots", "1", "-I", "100"])
bagging_cls_model_3.classifier = fc_model_3_ROS


AttS_cls_model_3 = AttributeSelectedClassifier()
AttS_cls_model_3.search = from_commandline('weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N 61', classname=get_classname(ASSearch))
AttS_cls_model_3.evaluator = from_commandline('weka.attributeSelection.InfoGainAttributeEval', classname=get_classname(ASEvaluation))
AttS_cls_model_3.classifier = bagging_cls_model_3


multisearch_cls_model_3 = MultiSearch(options = ["-S", "1","-class-label","1"])
multisearch_cls_model_3.evaluation = "FM"
multisearch_cls_model_3.search = ["-sample-size", "100", "-initial-folds", "2", "-subsequent-folds", "10",
                          "-initial-test-set", ".", "-subsequent-test-set", ".", "-num-slots", "1"]                        
mparam_model_3 = MathParameter()
mparam_model_3.prop = "numToSelect"
mparam_model_3.minimum = 5.0
mparam_model_3.maximum = 134.0
mparam_model_3.step = 1.0
mparam_model_3.base = 10.0
mparam_model_3.expression = "I"
multisearch_cls_model_3.parameters = [mparam_model_3]
multisearch_cls_model_3.classifier = AttS_cls_model_3


MissingValues = Filter(classname="weka.filters.unsupervised.attribute.ReplaceMissingValues")
fc_model_3_MV = FilteredClassifier(options=["-S","1"])
fc_model_3_MV.filter = MissingValues
fc_model_3_MV.classifier = multisearch_cls_model_3

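Maybe I can't use "numToSelect"? Is there a list of the properties that MultiSearch can work with?

I also have another question, about the sklearn-weka-plugin: is there any way to use RandomizedSearchCV or GridSearch (from sklearn) to search over parameter combinations for a Bagging model that uses ADTree as its base estimator?

Something like this: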

Base_CostS= WekaEstimator(classifier = base_model_1, classname="weka.classifiers.meta.CostSensitiveClassifier", 
                            options =["-cost-matrix", "[0.0 1.0; 1.0 0.0]", "-S", "1", "-W", "weka.classifiers.trees.ADTree"],
                            nominal_input_vars=[2,3,4], # which attributes need to be treated as nominal
                            nominal_output_var=True)    # class is nominal as well

bagging_model = BaggingClassifier(base_estimator = Base_CostS, n_estimators = 100, n_jobs = None, random_state = 1)

param_distributions_BG = {
    'n_estimators': [10, 50, 75, 100],
    'max_samples'   : [0.2, 0.5, 1.0],
    'bootstrap'   : [True, False],
    'base__iterations' : [10,15,20],
    'base__Expand_Nodes' : ["-3", "-2", "-1", "1"]

    
}

# Cross-validated search
# ==============================================================================
grid_r = RandomizedSearchCV(
        estimator  = bagging_model,
        param_distributions = param_distributions_BG,
        n_iter     = 50,
        scoring = {'Precision':'precision_macro',
                   'Recall':'recall_macro',
                   'F1_Score':'f1_macro'},
        cv         = RepeatedKFold(n_splits = 5, n_repeats = 5), 
        verbose    = 0,
        random_state = 1,
        return_train_score = True,
        refit = refit_aux
       )

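I don't know whether something like that is possible or whether I have to do it differently. I would also like to look at "feature_importances_", but I don't think the bagging model has that. The purpose of "feature_importances_" is to analyze it with SHAP.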

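MultiSearch uses the property path you define (the .prop attribute of the parameter object) to locate, within the nested Java objects, the object to which the parameter is applied. It does not need or have a predefined list of properties that it can optimize. Depending on how you nest classifiers, filters, and attribute selection, you have to adjust this path.

In your setup, you have the following nesting: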

MultiSearch
|
+- AttributeSelectedClassifier
   |
   +- Ranker
   |
   +- InfoGainAttributeEval

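Any property path is applied relative to the classifier that you specify for MultiSearch. If you use numToSelect, MultiSearch looks for this Java property on your AttributeSelectedClassifier. Since it is a property of the Ranker object, it cannot be found there, and that failed lookup is what produces the IntrospectionException in your output. The Ranker object itself is accessible via the search property of AttributeSelectedClassifier. In other words, you need to use search.numToSelect as your property path.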
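
To make the path lookup concrete, here is a minimal plain-Python sketch of the idea. It is only an illustration: the classes below are hypothetical stand-ins for the Java objects, and set_by_path is a made-up helper, not Weka's actual PropertyPath code. It shows why the bare name fails on the outer classifier while the dotted path reaches the Ranker.

```python
# Illustration only: plain-Python stand-ins for the nested Weka objects.
# These classes and set_by_path are hypothetical, not part of Weka or pyweka.

class Ranker:
    def __init__(self):
        self.numToSelect = -1  # the property we want to tune


class InfoGainAttributeEval:
    pass


class AttributeSelectedClassifier:
    def __init__(self):
        self.search = Ranker()                    # reachable as "search"
        self.evaluator = InfoGainAttributeEval()  # reachable as "evaluator"


def set_by_path(root, path, value):
    """Walk a dot-separated path starting at root and set the last element."""
    *parents, leaf = path.split(".")
    target = root
    for name in parents:
        target = getattr(target, name)  # descend one nesting level
    if not hasattr(target, leaf):
        raise AttributeError(f"{type(target).__name__} has no property '{leaf}'")
    setattr(target, leaf, value)


classifier = AttributeSelectedClassifier()

# The bare name is looked up on the outer classifier, which does not have it.
try:
    set_by_path(classifier, "numToSelect", 20)
    bare_path_found = True
except AttributeError:
    bare_path_found = False

# The dotted path first goes to the Ranker via "search", then sets the value.
set_by_path(classifier, "search.numToSelect", 20)
```

In the code from the question, the equivalent change is to set mparam_model_3.prop = "search.numToSelect".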