Saving multiple different polynomial regression objects in Python
I am trying to fit polynomial regressions of different degrees and save the model objects.
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error as MSE

df = pd.read_csv("kc_house_train_data(1).csv")
X = df['sqft_living'].values
Y = df['price'].values

lm = LinearRegression()

def Poly(X, degree):
    # Expand the single feature into the columns 1, x, x^2, ..., x^degree
    poly = PolynomialFeatures(degree=degree)
    poly_X = poly.fit_transform(X.reshape(-1, 1))
    return poly_X

models = []
for i in range(15):
    poly_X = Poly(X, i + 1)
    model = lm.fit(poly_X, Y)
    models.append(model)
where lm is a LinearRegression() from sklearn.
I end up with a list of models, but every one of them is the degree-15 polynomial. I'm not sure what I'm doing wrong.
Edit: output of print(df[['sqft_living', 'price']].head(11).to_string()):
    sqft_living      price
0          1180   221900.0
1          2570   538000.0
2           770   180000.0
3          1960   604000.0
4          1680   510000.0
5          5420  1225000.0
6          1715   257500.0
7          1060   291850.0
8          1780   229500.0
9          1890   323000.0
10         3560   662500.0
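You have to create a new model on every iteration. fit() returns the estimator itself, so with a single shared lm every entry in the list is a reference to the same object, and that object holds whatever was fitted last (degree 15). Try: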
models = []
for i in range(15):
    poly_X = Poly(X, i + 1)
    lm = LinearRegression()  # fresh estimator for each degree
    model = lm.fit(poly_X, Y)
    models.append(model)
>>> models[0].n_features_in_
2
>>> models[1].n_features_in_
3
>>> models[2].n_features_in_
4
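If you also want each saved model to remember its own polynomial expansion, one option is to wrap the transformer and the regressor in a pipeline. The following is only a sketch: the synthetic X and Y stand in for df['sqft_living'] and df['price'] (X is expressed in thousands of square feet here, which is my assumption to keep high powers manageable), and fit_poly_models and max_degree are hypothetical names, not part of the original code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the housing data (assumption: the real CSV is not
# available here). X is living area in thousands of square feet.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 5.0, size=200)
Y = 100_000 + 150_000 * X + rng.normal(0, 20_000, size=200)

# fit() returns the estimator itself, which is why appending the result of a
# shared estimator stores the same reference again and again.
shared = LinearRegression()
first = shared.fit(X.reshape(-1, 1), Y)
same_object = first is shared


def fit_poly_models(X, Y, max_degree=5):
    """Return {degree: fitted pipeline}, building a fresh pipeline per degree."""
    models = {}
    for degree in range(1, max_degree + 1):
        model = make_pipeline(
            PolynomialFeatures(degree=degree),
            LinearRegression(),
        )
        models[degree] = model.fit(X.reshape(-1, 1), Y)
    return models


models = fit_poly_models(X, Y)

# The last pipeline step is the LinearRegression; it sees degree + 1 columns
# (the bias column plus one column per power).
n_features = {d: m[-1].n_features_in_ for d, m in models.items()}
```

Because each pipeline carries its own PolynomialFeatures step, you can call models[3].predict(new_X.reshape(-1, 1)) on raw values without redoing the expansion by hand.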