How to update a Logistic Regression model?
I trained a logistic regression model. Now I have to update (partial fit) the model with a new training dataset. Is that possible?
You cannot use partial_fit on LogisticRegression.
But you can:
- Use warm_start=True, which reuses the solution of the previous call to fit as the initialization, speeding up convergence.
- Use SGDClassifier with loss='log' (renamed to loss='log_loss' in scikit-learn 1.1), which is equivalent to LogisticRegression and supports partial_fit.
Note the difference between partial_fit and warm_start. Both approaches start from the previous model and update it, but partial_fit updates the model only slightly, whereas warm_start iterates to full convergence on the new training data, forgetting the previous model. warm_start is only used to speed up convergence.
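To make the warm_start behaviour concrete, a sketch with synthetic data (names invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X1 = rng.randn(200, 4)
y1 = (X1[:, 0] + X1[:, 1] > 0).astype(int)

clf = LogisticRegression(warm_start=True, max_iter=1000)
clf.fit(X1, y1)
coef_after_first = clf.coef_.copy()

# A second fit on new data starts from coef_after_first instead of
# zeros -- but it still converges on the new data alone, so the old
# data influences the result only through the initialization.
X2 = rng.randn(200, 4)
y2 = (X2[:, 0] + X2[:, 1] > 0).astype(int)
clf.fit(X2, y2)
```

Note that warm_start has no effect with the liblinear solver; the default lbfgs solver supports it.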
See also the glossary:
warm_start
When fitting an estimator repeatedly on the same dataset, but for multiple parameter values (such as to find the value maximizing performance as in grid search), it may be possible to reuse aspects of the model learnt from the previous parameter value, saving time. When warm_start is true, the existing fitted model attributes are used to initialise the new model in a subsequent call to fit.
Note that this is only applicable for some models and some parameters, and even some orders of parameter values. For example, warm_start may be used when building random forests to add more trees to the forest (increasing n_estimators) but not to reduce their number.
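The forest-growing pattern mentioned above can be sketched as follows (synthetic data, names invented for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
X = rng.randn(100, 4)
y = (X[:, 0] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=10, warm_start=True,
                                random_state=0)
forest.fit(X, y)                  # builds 10 trees

# Increase n_estimators and refit: the 10 existing trees are kept
# and 5 new ones are added, rather than rebuilding from scratch.
forest.set_params(n_estimators=15)
forest.fit(X, y)
```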
partial_fit also retains the model between calls, but differs: with warm_start the parameters change and the data is (more-or-less) constant across calls to fit; with partial_fit, the mini-batch of data changes and model parameters stay fixed.
There are cases where you want to use warm_start to fit on different, but closely related data. For example, one may initially fit to a subset of the data, then fine-tune the parameter search on the full dataset. For classification, all data in a sequence of warm_start calls to fit must include samples from each class.
partial_fit
Facilitates fitting an estimator in an online fashion. Unlike fit, repeatedly calling partial_fit does not clear the model, but updates it with respect to the data provided. The portion of data provided to partial_fit may be called a mini-batch. Each mini-batch must be of consistent shape, etc.
partial_fit may also be used for out-of-core learning, although usually limited to the case where learning can be performed online, i.e. the model is usable after each partial_fit and there is no separate processing needed to finalize the model. cluster.Birch introduces the convention that calling partial_fit(X) will produce a model that is not finalized, but the model can be finalized by calling partial_fit(), i.e. without passing a further mini-batch.
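A sketch of that Birch convention, using synthetic mini-batches invented for illustration:

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.RandomState(0)
brc = Birch(n_clusters=2)

# Feed several mini-batches; after each call the model is usable
# but not yet finalized.
for _ in range(3):
    brc.partial_fit(rng.randn(50, 2))

# Calling partial_fit() without a mini-batch finalizes the model
# (runs only the global clustering step over the built CF-tree).
brc.partial_fit()
labels = brc.predict(rng.randn(5, 2))
```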
Generally, estimator parameters should not be modified between calls to partial_fit, although partial_fit should validate them as well as the new mini-batch of data. In contrast, warm_start is used to repeatedly fit the same estimator with the same data but varying parameters.