Why am I getting this error? python-weka-wrapper3
I ran this code a few days ago and it worked fine, but now it shows me this error:
----------------------------------------------------------------------------
TEST MODEL
----------------------------------------------------------------------------
Encountered exception while evaluating classifier, skipping!
- Classifier: weka.classifiers.meta.AttributeSelectedClassifier -E "weka.attributeSelection.InfoGainAttributeEval " -S "weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N 5" -W weka.classifiers.meta.Bagging -- -P 100 -S 1 -num-slots 1 -I 100 -W weka.classifiers.meta.FilteredClassifier -- -F "weka.filters.supervised.instance.SMOTE -C 0 -K 3 -P 250.0 -S 1" -S 1 -W weka.classifiers.meta.CostSensitiveClassifier -- -cost-matrix "[0.0 2.0; 1.0 0.0]" -S 1 -W weka.classifiers.trees.ADTree -- -B 10 -E -3 -S 1
java.lang.Exception: Cannot use 0 neighbors!
weka.filters.supervised.instance.SMOTE.doSMOTE(SMOTE.java:539)
weka.filters.supervised.instance.SMOTE.batchFinished(SMOTE.java:489)
weka.filters.Filter.useFilter(Filter.java:708)
weka.classifiers.meta.FilteredClassifier.setUp(FilteredClassifier.java:719)
weka.classifiers.meta.FilteredClassifier.buildClassifier(FilteredClassifier.java:794)
weka.classifiers.ParallelIteratedSingleClassifierEnhancer.buildClassifiers(ParallelIteratedSingleClassifierEnhancer.java:229)
weka.classifiers.meta.Bagging.buildClassifier(Bagging.java:709)
weka.classifiers.meta.AttributeSelectedClassifier.buildClassifier(AttributeSelectedClassifier.java:513)
weka.classifiers.evaluation.Evaluation.crossValidateModel(Evaluation.java:843)
weka.classifiers.Evaluation.crossValidateModel(Evaluation.java:392)
weka.classifiers.meta.multisearch.DefaultEvaluationTask.doRun(DefaultEvaluationTask.java:98)
weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:113)
weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:34)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:829)
at weka.filters.supervised.instance.SMOTE.doSMOTE(SMOTE.java:539)
at weka.filters.supervised.instance.SMOTE.batchFinished(SMOTE.java:489)
at weka.filters.Filter.useFilter(Filter.java:708)
at weka.classifiers.meta.FilteredClassifier.setUp(FilteredClassifier.java:719)
at weka.classifiers.meta.FilteredClassifier.buildClassifier(FilteredClassifier.java:794)
at weka.classifiers.ParallelIteratedSingleClassifierEnhancer.buildClassifiers(ParallelIteratedSingleClassifierEnhancer.java:229)
at weka.classifiers.meta.Bagging.buildClassifier(Bagging.java:709)
at weka.classifiers.meta.AttributeSelectedClassifier.buildClassifier(AttributeSelectedClassifier.java:513)
at weka.classifiers.evaluation.Evaluation.crossValidateModel(Evaluation.java:843)
at weka.classifiers.Evaluation.crossValidateModel(Evaluation.java:392)
at weka.classifiers.meta.multisearch.DefaultEvaluationTask.doRun(DefaultEvaluationTask.java:98)
at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:113)
at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:34)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
This is my code:
# Imports assumed from python-weka-wrapper3, matplotlib and seaborn (not shown in the original post)
from weka.classifiers import (Classifier, SingleClassifierEnhancer, FilteredClassifier,
                              AttributeSelectedClassifier, MultiSearch, Evaluation)
from weka.filters import Filter
from weka.attribute_selection import ASSearch, ASEvaluation
from weka.core.classes import Random, MathParameter, from_commandline, get_classname
import weka.plot.classifiers as plcls
import matplotlib.pyplot as plt
import seaborn as sns
base_model_3 = Classifier(classname="weka.classifiers.trees.ADTree",
                          options=["-B", "10", "-E", "-3", "-S", "1"])
CostS_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.CostSensitiveClassifier",
                                             options=["-cost-matrix", "[0.0 2.0; 1.0 0.0]", "-S", "1"])
CostS_cls_model_3.classifier = base_model_3
smote_model_3 = Filter(classname="weka.filters.supervised.instance.SMOTE",
                       options=["-C", "0", "-K", "3", "-P", "250.0", "-S", "1"])
fc_model_3 = FilteredClassifier(options=["-S","1"])
fc_model_3.filter = smote_model_3
fc_model_3.classifier = CostS_cls_model_3
bagging_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.Bagging",
                                               options=["-P", "100", "-S", "1", "-num-slots", "1", "-I", "100"])
bagging_cls_model_3.classifier = fc_model_3
AttS_cls_model_3 = AttributeSelectedClassifier()
AttS_cls_model_3.search = from_commandline('weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N 61', classname=get_classname(ASSearch))
AttS_cls_model_3.evaluator = from_commandline('weka.attributeSelection.InfoGainAttributeEval', classname=get_classname(ASEvaluation))
AttS_cls_model_3.classifier = bagging_cls_model_3
multisearch_cls_model_3 = MultiSearch(options = ["-S", "1","-class-label","1"])
multisearch_cls_model_3.evaluation = "FM"
multisearch_cls_model_3.search = ["-sample-size", "100", "-initial-folds", "2", "-subsequent-folds", "10",
                                  "-initial-test-set", ".", "-subsequent-test-set", ".", "-num-slots", "1"]
mparam_model_3 = MathParameter()
mparam_model_3.prop = "search.numToSelect"
mparam_model_3.minimum = 5.0
mparam_model_3.maximum = 134.0
mparam_model_3.step = 1.0
mparam_model_3.base = 10.0
mparam_model_3.expression = "I"
multisearch_cls_model_3.parameters = [mparam_model_3]
multisearch_cls_model_3.classifier = AttS_cls_model_3
MissingValues = Filter(classname="weka.filters.unsupervised.attribute.ReplaceMissingValues")
fc_model_3_MV = FilteredClassifier(options=["-S","1"])
fc_model_3_MV.filter = MissingValues
fc_model_3_MV.classifier = multisearch_cls_model_3
print("----------------------------------------------------------------------------")
print("CROSS VALIDATION")
print("----------------------------------------------------------------------------")
evl_model_3 = Evaluation(data_modelo_3_encoded)
evl_model_3.crossvalidate_model(fc_model_3_MV, data_modelo_3_encoded, 10, Random(1))
print(evl_model_3.summary())
conf_matrix = evl_model_3.confusion_matrix
plt.figure(figsize=(8,8))
sns.heatmap(conf_matrix, xticklabels = Labels, yticklabels = Labels, annot = True, fmt = "f", linewidth = 2)
plt.title("Confusion Matrix")
plt.ylabel("True Class")
plt.xlabel("Predicted Class")
plt.show()
print("----------------------------------------------------------------------------")
print("TEST MODEL")
print("----------------------------------------------------------------------------")
train_model_3, test_model_3 = data_modelo_3_encoded.train_test_split(70.0, Random(1))
fc_model_3_MV.build_classifier(train_model_3)
evl_model_3 = Evaluation(test_model_3)
evl_model_3.test_model(fc_model_3_MV, test_model_3)
print("")
print("=== Setup ===")
print("Classifier: ")
print(fc_model_3_MV.to_commandline())
print("----------------------------------------------------------------------------")
print("Dataset: ")
print(test_model_3.relationname)
print("----------------------------------------------------------------------------")
print("")
print(evl_model_3.summary("=== " + str(10) + " -fold Cross-Validation ==="))
print("----------------------------------------------------------------------------")
print(evl_model_3.class_details())
print("----------------------------------------------------------------------------")
plcls.plot_roc(evl_model_3, class_index=[0, 1], wait=True)
print("----------------------------------------------------------------------------")
conf_matrix = evl_model_3.confusion_matrix
plt.figure(figsize=(8,8))
sns.heatmap(conf_matrix, xticklabels = Labels, yticklabels = Labels, annot = True, fmt = "f", linewidth = 2)
plt.title("Confusion Matrix")
plt.ylabel("True Class")
plt.xlabel("Predicted Class")
plt.show()
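Everything works up to the "TEST MODEL" section. I didn't change anything apart from adding a confusion matrix to display the results, so I don't understand why this error appears.
The dataset for this model is imbalanced, but this error never came up before.
SMOTE is probably very sensitive to having enough instances of each class (I know nothing about the filter's internals). I was able to reproduce the problem using a UCI dataset. However, when I changed the train/test split to 90/10 (which is what your cross-validation uses), it worked.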
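To illustrate why a smaller training set could trigger this, here is a minimal pure-Python sketch that does not call Weka at all. It assumes SMOTE can use at most one fewer neighbor than the number of minority instances it receives, which is what the "Cannot use 0 neighbors!" message suggests but which I have not verified against the filter's source. The helper names `effective_neighbors` and `failing_bags` are made up for this example. In your setup the 70/30 split, the 2 initial folds of MultiSearch and the bootstrap resampling done by Bagging each shrink or reshuffle the data before SMOTE sees it, so a bag with a single minority instance becomes more likely than under the 90/10 folds of cross-validation.

```python
import random


def effective_neighbors(k, minority_count):
    """Neighbors usable for a class of this size (assumed rule: at most count - 1)."""
    return min(k, minority_count - 1)


def failing_bags(labels, k=3, n_bags=100, seed=1):
    """Count bootstrap samples whose rarest present class leaves no usable neighbor."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_bags):
        # Bagging with -P 100 draws a sample of the same size, with replacement.
        sample = [rng.choice(labels) for _ in labels]
        counts = {}
        for label in sample:
            counts[label] = counts.get(label, 0) + 1
        # The rarest class that is actually present plays the role of the minority class.
        if effective_neighbors(k, min(counts.values())) < 1:
            failures += 1
    return failures


if __name__ == "__main__":
    # Hypothetical class distributions, not taken from the dataset in the question.
    balanced = [0] * 50 + [1] * 50
    skewed = [0] * 99 + [1]
    print("balanced:", failing_bags(balanced), "of 100 bags would fail")
    print("skewed:  ", failing_bags(skewed), "of 100 bags would fail")
```

If this is what is happening, checking the class counts of `train_model_3` before calling `build_classifier`, or using a larger training percentage, would be the first things to try.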