Difference between LinearRegression() and Ridge(alpha=0)
When the alpha parameter approaches zero, the Tikhonov (ridge) cost becomes equal to the least-squares cost. Everything in the scikit-learn docs about the subject says the same. Therefore I expected
sklearn.linear_model.Ridge(alpha=1e-100).fit(data, target)
to be equivalent to
sklearn.linear_model.LinearRegression().fit(data, target)
But it is not. Why?
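For what it's worth, the equivalence claim can be checked directly against the closed forms: ridge solves (XᵀX + αI)w = Xᵀy, which reduces to the OLS normal equations as α → 0 whenever XᵀX is invertible. A minimal NumPy sketch of that limit (this is not scikit-learn's actual solver, and the random data is made up for illustration):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # well-conditioned random design matrix
y = rng.normal(size=20)
alpha = 1e-8

# ridge closed form: w = (X^T X + alpha * I)^(-1) X^T y
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)
# OLS least-squares solution of X w = y
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(w_ridge, w_ols))  # True when X^T X is well conditioned

The interesting cases are exactly the ones where XᵀX is not well conditioned, which is where the code below diverges.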
Updated code:
import pandas as pd
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
%matplotlib inline

dataset = pd.read_csv('house_price_data.csv')
# use .values: reshape() is not available on a pandas Series
X = dataset['sqft_living'].values.reshape(-1, 1)
Y = dataset['price'].values.reshape(-1, 1)
polyX = PolynomialFeatures(degree=15).fit_transform(X)
model1 = LinearRegression().fit(polyX, Y)
model2 = Ridge(alpha=1e-100).fit(polyX, Y)
plt.plot(X, Y, '.',
         X, model1.predict(polyX), 'g-',
         X, model2.predict(polyX), 'r-')
Note: the plot looks the same with alpha=1e-8 or alpha=1e-100.
According to the documentation, alpha must be a positive float. Your example has alpha=0 as an integer. With a small, positive alpha, the results of Ridge and LinearRegression appear to converge.
from sklearn.linear_model import Ridge, LinearRegression

data = [[0, 0], [1, 1], [2, 2]]
target = [0, 1, 2]

ridge_model = Ridge(alpha=1e-8).fit(data, target)
print("RIDGE COEFS: " + str(ridge_model.coef_))
ols = LinearRegression().fit(data, target)
print("OLS COEFS: " + str(ols.coef_))

# RIDGE COEFS: [ 0.49999999  0.50000001]
# OLS COEFS:   [ 0.5  0.5]
#
# VS. with alpha=0:
# RIDGE COEFS: [ 1.57009246e-16  1.00000000e+00]
# OLS COEFS:   [ 0.5  0.5]
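One plausible explanation for the alpha=0 discrepancy (my reading, not stated in the docs): the two feature columns in this toy data are identical, so XᵀX is singular and the least-squares solution is not unique. LinearRegression's least-squares routine returns the minimum-norm solution [0.5, 0.5], while Ridge's solver at alpha=0 lands on a different, equally valid solution. A quick check:

import numpy as np

X = np.array([[0., 0.], [1., 1.], [2., 2.]])  # identical columns
print(np.linalg.matrix_rank(X))  # 1 -> X^T X is singular, infinitely many fits

# both coefficient vectors reproduce the target [0, 1, 2] exactly
for w in (np.array([0.5, 0.5]), np.array([0.0, 1.0])):
    print(X @ w)  # [0. 1. 2.] either way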
Update
The alpha=0-as-int problem above seems to be an issue only for toy problems like the example above.
For the housing data, the problem is one of scaling. The degree-15 polynomial you invoke causes numerical overflow. To produce identical results from LinearRegression and Ridge, try scaling your data first (a conditioning check follows after the output below):
import pandas as pd
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.preprocessing import PolynomialFeatures, scale

dataset = pd.read_csv('house_price_data.csv')
# scale the X data to prevent numerical errors; .values for a plain ndarray
X = scale(dataset['sqft_living'].values.reshape(-1, 1))
Y = dataset['price'].values.reshape(-1, 1)
polyX = PolynomialFeatures(degree=15).fit_transform(X)
model1 = LinearRegression().fit(polyX, Y)
model2 = Ridge(alpha=0).fit(polyX, Y)
print("OLS Coefs: " + str(model1.coef_[0]))
print("Ridge Coefs: " + str(model2.coef_[0]))
#OLS Coefs: [ 0.00000000e+00 2.69625315e+04 3.20058010e+04 -8.23455994e+04
# -7.67529485e+04 1.27831360e+05 9.61619464e+04 -8.47728622e+04
# -5.67810971e+04 2.94638384e+04 1.60272961e+04 -5.71555266e+03
# -2.10880344e+03 5.92090729e+02 1.03986456e+02 -2.55313741e+01]
#Ridge Coefs: [ 0.00000000e+00 2.69625315e+04 3.20058010e+04 -8.23455994e+04
# -7.67529485e+04 1.27831360e+05 9.61619464e+04 -8.47728622e+04
# -5.67810971e+04 2.94638384e+04 1.60272961e+04 -5.71555266e+03
# -2.10880344e+03 5.92090729e+02 1.03986456e+02 -2.55313741e+01]
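The promised conditioning check: with raw square footages in the thousands, the degree-15 columns span dozens of orders of magnitude, the design matrix becomes numerically rank-deficient, and different solvers land on different coefficients; scaling first shrinks the condition number dramatically. A sketch with made-up footage values (the actual range in house_price_data.csv is assumed, not verified):

import numpy as np
from sklearn.preprocessing import PolynomialFeatures, scale

# hypothetical square footages on roughly the scale of the housing data
sqft = np.linspace(500.0, 10000.0, 50).reshape(-1, 1)

raw = PolynomialFeatures(degree=15).fit_transform(sqft)
scaled = PolynomialFeatures(degree=15).fit_transform(scale(sqft))

print(np.linalg.cond(raw))     # astronomically large: solvers disagree
print(np.linalg.cond(scaled))  # still big, but many orders of magnitude smaller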