CatBoost: suppressing iteration results in a grid search

Question

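I am trying to use the CatBoost classifier, and I run a grid search with its randomized_search() method. Unfortunately, the method prints iteration results to stdout for every tree built for every model it tries.

There is a parameter that is supposed to control this: verbose. Ideally, verbose could be set to False to suppress all printing to stdout, or to an integer giving the interval between reported models (models, not trees).

Do you know how to control this? I am getting millions of lines in my log file...

This question is somewhat related to another question, but that one concerns the fit() method, which has a logging_level parameter as well as a silent parameter. Another method, cv() (cross-validation), responds to logging_level='Silent' by cutting off all output.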

Answer 1

Set logging_level='Silent' when instantiating the model, and set verbose=False when running the random search. Together these should suppress all output.

import catboost
from sklearn.datasets import make_classification
from scipy import stats

# generate some data
X, y = make_classification(n_features=10)

# instantiate the model with logging_level='Silent'
model = catboost.CatBoostClassifier(iterations=1000, logging_level='Silent')

pool = catboost.Pool(X, y)

parameters = {
    'learning_rate': stats.uniform(0.01, 0.1),
    'depth': stats.binom(n=10, p=0.2)
}

# run random search with verbose=False
randomized_search_results = model.randomized_search(
    parameters,
    pool,
    n_iter=10,
    shuffle=False,
    plot=False,
    verbose=False,
)
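
If some lines still get through, a generic fallback is to redirect stdout around the call. The sketch below uses only the standard library and rests on the assumption that the output is written through Python's sys.stdout. noisy_search is a hypothetical stand-in for the search call, not part of CatBoost.

```python
import contextlib
import io


@contextlib.contextmanager
def silenced_stdout():
    """Temporarily send everything written to sys.stdout into a buffer."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        yield buffer


def noisy_search(n_models):
    # Hypothetical stand-in for a search call that prints per-iteration results.
    for i in range(n_models):
        print(f"{i}:\tlearn: 0.5")
    return {"params": {"depth": 4}}


with silenced_stdout() as captured:
    results = noisy_search(3)

# Nothing reached the real stdout; the text is still available if needed.
suppressed_lines = captured.getvalue().splitlines()
```

Note that contextlib.redirect_stdout only intercepts writes made through Python's sys.stdout. Output written directly to the process's file descriptor by native code would not be captured this way.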
