Lasso Regression: The continuous heavy step function

From many documents I have learned that the recipe for Ridge regression is:

loss_Ridge = loss_function + lambda x L2 norm of slope

and the recipe for Lasso regression is:

loss_Lasso = loss_function + lambda x L1 norm of slope
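
To make my understanding concrete, here is a minimal sketch of those two recipes for a simple linear model y ≈ A*x + b (the names A, b, lam and the two functions are placeholders I chose for illustration, not from the book):

import numpy as np

def ridge_loss(A, b, x, y, lam):
    residual = y - (A * x + b)
    return np.mean(residual ** 2) + lam * A ** 2      # lambda x (squared) L2 norm of the slope

def lasso_loss(A, b, x, y, lam):
    residual = y - (A * x + b)
    return np.mean(residual ** 2) + lam * np.abs(A)   # lambda x L1 norm of the slope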

When I read the topic "Implementing Lasso and Ridge Regression" in the "TensorFlow Machine Learning Cookbook", the author explains:

"...we will use a continuous approximation to a step function, called the continuous heavy step function..."

The author also provides the lines of code here. I don't understand what is called the "continuous heavy step function" in this case. Please help me.

Based on the link you provided,

if regression_type == 'LASSO':
    # Declare Lasso loss function
    # Lasso Loss = L2_Loss + heavyside_step,
    # Where heavyside_step ~ 0 if A < constant, otherwise ~ 99
    lasso_param = tf.constant(0.9)
    heavyside_step = tf.truediv(1., tf.add(1., tf.exp(tf.multiply(-50., tf.subtract(A, lasso_param)))))
    regularization_param = tf.multiply(heavyside_step, 99.)
    loss = tf.add(tf.reduce_mean(tf.square(y_target - model_output)), regularization_param)

This heavyside_step function is very close to a logistic function, which in turn can serve as a continuous approximation of a step function.
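
Written out, that line of code computes

heavyside_step = 1 / (1 + exp(-50 * (A - lasso_param))),  with lasso_param = 0.9

i.e. a logistic sigmoid centered at 0.9 with a very steep slope of 50, so it jumps from roughly 0 to roughly 1 around A = 0.9.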

You use a continuous approximation because the loss function needs to be differentiable with respect to your model parameters.

For some intuition, read section 1.6 on the constrained formulation in https://www.cs.ubc.ca/~schmidtm/Documents/2005_Notes_Lasso.pdf

You can see in your code that if A < 0.9, the regularization_param vanishes, so the optimization will constrain A to that range.
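
A quick way to check this is to evaluate the formula above for a few values of A (a plain-Python sketch reusing the constants 50, 0.9 and 99 from the snippet; the commented numbers are approximate):

import math

for A in (0.5, 0.85, 0.95):
    heavyside_step = 1.0 / (1.0 + math.exp(-50.0 * (A - 0.9)))
    regularization_param = 99.0 * heavyside_step
    print(A, heavyside_step, regularization_param)

# A = 0.50 -> heavyside_step ≈ 0.000, regularization_param ≈ 0.0
# A = 0.85 -> heavyside_step ≈ 0.076, regularization_param ≈ 7.5
# A = 0.95 -> heavyside_step ≈ 0.924, regularization_param ≈ 91.5

So the penalty is essentially zero while A stays below 0.9 and becomes very large once A exceeds it, which pushes the optimizer to keep A below lasso_param.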

If you want to use Lasso regression to select features, here you have an example:

from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

# Fit a Lasso model and keep only the features with non-zero coefficients
estimator = Lasso()
featureSelection = SelectFromModel(estimator)
featureSelection.fit(features_vector, target)
selectedFeatures = featureSelection.transform(features_vector)
print(selectedFeatures)
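
Here SelectFromModel keeps only the features whose Lasso coefficient magnitude exceeds a small threshold, i.e. effectively the non-zero ones, which is exactly the sparsity effect the L1 penalty produces; you can control how aggressively features are dropped through Lasso's alpha parameter. Note that features_vector and target above stand for your own data.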