Get metrics out of AutoMLRun based on test_data

Question

I am executing an AutoML run with the script below, and I am also passing in a test dataset.

import logging

from azureml.train.automl import AutoMLConfig

# compute_target, training_data, test_data and label_column_name
# are assumed to be defined elsewhere.
automl_settings = {
    "n_cross_validations": 10,
    "primary_metric": "spearman_correlation",
    "enable_early_stopping": True,
    "max_concurrent_iterations": 10,
    "max_cores_per_iteration": -1,
    "experiment_timeout_hours": 1,
    "featurization": "auto",
    "verbosity": logging.INFO,
}
automl_config = AutoMLConfig(
    task="regression",
    debug_log="automl_errors.log",
    compute_target=compute_target,
    training_data=training_data,
    test_data=test_data,
    label_column_name=label_column_name,
    model_explainability=True,
    **automl_settings,
)

Answer 1

It looks like you may also need to specify the test_size parameter alongside test_data, according to the AutoMLConfig docs:

If this parameter or the test_size parameter are not specified then no test run will be executed automatically after model training is completed. Test data should contain both features and label column. If test_data is specified then the label_column_name parameter must be specified.

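As for how to extract those metrics and predictions, I imagine they would be associated with the AutoMLRun itself (as opposed to one of its child runs).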

Answer 2

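Note that test dataset support is a feature still in PRIVATE PREVIEW. It will probably be released as PUBLIC PREVIEW later in November, but until then you need to be enrolled in the PRIVATE PREVIEW in order to see it in the UI. You can email me at cesardl at microsoft dot com with your Azure subscription ID so that you can see it in the UI.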

You can find more information on how to get started here: https://github.com/Azure/automl-testdataset-preview

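As for how to use it, you need to provide either test_data (a specific test AML Tabular Dataset, for example one you loaded from a file or split manually beforehand) or a test_size, which is the fraction to split off from the single/original dataset (i.e. 0.2 is 20%).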
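The two options above can be sketched as a small helper that builds the test-related keyword arguments for AutoMLConfig. The helper name holdout_kwargs is hypothetical, and the check that test_size lies strictly between 0 and 1 is an assumption based on it being a fraction. Whether both options may be combined is not covered here.

```python
def holdout_kwargs(test_data=None, test_size=None):
    """Build the test-related keyword arguments for AutoMLConfig (hypothetical helper)."""
    if test_data is None and test_size is None:
        # Per the quoted docs, no test run is executed in this case.
        raise ValueError("specify test_data or test_size to get a test run")
    kwargs = {}
    if test_data is not None:
        kwargs["test_data"] = test_data
    if test_size is not None:
        # Assumption: test_size is a fraction strictly between 0 and 1.
        if not 0.0 < test_size < 1.0:
            raise ValueError("test_size must be a fraction such as 0.2 (20%)")
        kwargs["test_size"] = test_size
    return kwargs


# Hold out 20% of the single/original dataset as the test set.
print(holdout_kwargs(test_size=0.2))  # {'test_size': 0.2}
```

The result would then be unpacked into the config, for example AutoMLConfig(..., **holdout_kwargs(test_size=0.2)).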

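Regarding the TEST metrics: since you can make multiple TEST runs against a single model, you need to go to the results of the specific TEST run, available under the "Test" link.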

azure-machine-learning-service