Shapes not aligned when fitting polynomial regression
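Long-time listener, first-time caller...
I know similar questions have been answered before (I have gone through other threads on this topic), but I am still stuck. How can I get my regression to fit? My code is below: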
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
#data
np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)
#regression fitting
X_predict_input = np.linspace(0,10,100).reshape(-1,1)
y_train = y_train.reshape((-1,1))
X_train = X_train.reshape((-1,1))
#looping through different degree values
for i, degree in enumerate([1,3,6,9]):
    poly = PolynomialFeatures(degree=degree)
    X_train_poly = poly.fit_transform(X_train)
    linreg = LinearRegression().fit(X_train_poly, y_train)
    result[i,:] = linreg.predict(X_predict_input)
I tried to fix the shapes of X_train and y_train, but after checking each shape, I think X_train_poly is what causes this error:
X_train shape: (11, 1)
y_train shape: (11, 1)
X_train_poly shape: (11, 10)
The corresponding error message:
ValueError: shapes (100,1) and (2,1) not aligned: 1 (dim 1) != 2 (dim 0)
When I tried to fix the shape mismatch in X_train_poly like this...
X_train_poly = poly.fit_transform(X_train).reshape((-1,1))
...I got this error:
ValueError: Found input variables with inconsistent numbers of samples: [22, 11]
I have spent a lot of time on this, so any insight would be greatly appreciated!
Thanks in advance :)
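I think the problem is simple. You are using the PolynomialFeatures transformer to generate features for the training data, but at prediction time you are not applying the same transformation to the input data. The model is fitted on degree + 1 columns, while X_predict_input still has a single column, which is why the shapes are not aligned. A corrected version for a single degree: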
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
# data
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n)/5
y = np.sin(x) + x/6 + np.random.randn(n)/10
X_train, X_test, y_train, y_test = train_test_split(x.reshape((-1, 1)),
                                                    y.reshape((-1, 1)),
                                                    random_state=0)
# Check data matrices are in columns
assert(X_train.shape == (11, 1))
assert(y_train.shape == (11, 1))
# Build library of polynomial features
degree = 3
poly = PolynomialFeatures(degree)
X_train_poly = poly.fit_transform(X_train)
assert(X_train_poly.shape == (11, 4))
# Fit model
linreg = LinearRegression().fit(X_train_poly, y_train)
# Make prediction
X_predict = np.linspace(0, 10, 100).reshape(-1, 1)
# Reuse the transformer fitted on the training data instead of refitting it
X_predict_poly = poly.transform(X_predict)
y_predict = linreg.predict(X_predict_poly)
assert(y_predict.shape == X_predict.shape)
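To see where the two error messages in the question come from, the shapes can be checked directly. This is only a rough sketch: it assumes degree 1 (the first iteration of the loop) and uses made-up stand-in data with the same (11, 1) shape as X_train, so the variable names n_features_fitted, n_features_given and n_rows_after_reshape are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Stand-in training data (an assumption) with the same shape as in the question
rng = np.random.RandomState(0)
X_train = rng.uniform(0, 10, size=(11, 1))
y_train = np.sin(X_train) + X_train / 6

# degree=1 is the first loop iteration: the feature columns are [1, x]
poly = PolynomialFeatures(degree=1)
X_train_poly = poly.fit_transform(X_train)
linreg = LinearRegression().fit(X_train_poly, y_train)

# The model was fitted on 2 columns ...
n_features_fitted = linreg.coef_.shape[1]

# ... but predict() was handed the untransformed input with 1 column,
# hence "shapes (100,1) and (2,1) not aligned"
X_predict_input = np.linspace(0, 10, 100).reshape(-1, 1)
n_features_given = X_predict_input.shape[1]

# Reshaping the (11, 2) feature matrix to one column yields 22 rows,
# which no longer matches the 11 targets: "[22, 11]"
n_rows_after_reshape = X_train_poly.reshape((-1, 1)).shape[0]

print(n_features_fitted, n_features_given, n_rows_after_reshape)  # 2 1 22
```

So the training matrix was fine all along; the mismatch is between the number of columns seen by fit and by predict.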
Update:
To avoid having to apply the transformation by hand every time you make a prediction, you may want to look at sklearn.pipeline.Pipeline:
# Using a pipeline to automate the input transformation
from sklearn.pipeline import Pipeline
poly = PolynomialFeatures(degree)
model = LinearRegression()
pipeline = Pipeline(steps=[('t', poly), ('m', model)])
linreg = pipeline.fit(X_train, y_train)
y_predict2 = linreg.predict(X_predict)
# Compare with a tolerance rather than requiring bit-identical floats
assert(np.allclose(y_predict, y_predict2))
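Applied to the loop over degrees from the question, the pipeline approach might look like the sketch below. Two things here are my assumptions rather than part of the original code: result is preallocated with one row per degree (it was never defined in the question), and y is left one-dimensional so that predict returns a flat array of length 100 that fits into a row of result.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same data as in the question
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10
X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, random_state=0)

X_predict_input = np.linspace(0, 10, 100).reshape(-1, 1)
degrees = [1, 3, 6, 9]

# One row of predictions per degree (assumed layout, undefined in the question)
result = np.zeros((len(degrees), len(X_predict_input)))

for i, degree in enumerate(degrees):
    # The pipeline expands the input into polynomial features
    # inside both fit() and predict(), so the shapes always match
    model = Pipeline(steps=[('t', PolynomialFeatures(degree=degree)),
                            ('m', LinearRegression())])
    model.fit(X_train, y_train)
    result[i, :] = model.predict(X_predict_input)

print(result.shape)  # (4, 100)
```

Creating a fresh pipeline inside the loop keeps each degree's transformer and model together, so there is no way to pair a fitted model with features of the wrong degree.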