Very low score with scikit-learn linear regression for obvious pattern

from random import randint,choice
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
import numpy as np
from sklearn.linear_model import LinearRegression as LR

x1 = []
for i in range(1000):
    if i%2 == 0:
        x1.append(1001)
    else:
        x1.append(999)

leng = [x for x in range(len(x1))]

a = np.array(leng).reshape(len(leng),1)
b = np.array(x1).reshape(len(leng),1)

t1,t2,y1,y2 = train_test_split(a,b)

l = LR()
l.fit(t1,y1)
print(l.score(t2,y2))
print(l.predict(t2))

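The dependent values are only ever 1001 or 999, plotted against a linearly increasing independent axis. Linear regression should give this a score of 1.0; however, my score is below 0. Any ideas? I think I must be doing something wrong.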

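Because there is no linear relationship, and no pattern a straight line can capture, between a and b. The .score method returns R-squared, and here R-squared should be about 0. Computing it manually: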

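predictions = l.predict(t2)  # l is the model fitted in the question
# explained sum of squares on the test split
rss = np.sum(np.square(predictions - y2.mean()))
# total sum of squares over all of b
sst = np.sum(np.square(b - b.mean()))

rsquared = rss / sst; rsquared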
Out[31]: 0.0040910187945010796
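
As for why the score can come out below 0: on held-out data scikit-learn computes R-squared as 1 - SS_res/SS_tot, so a fit that does slightly worse than predicting the test mean gives a small negative number. Below is a minimal sketch of that, not the original poster's exact code. The names idx, X_parity and parity_model, the fixed random_state=0, and the idea of using i % 2 as a feature are all my own assumptions for illustration. It rebuilds the same data, checks the manual formula against .score, and then shows that the pattern becomes linear once parity is the feature.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same data as in the question: the index i is the feature,
# and the target alternates between 1001 (even i) and 999 (odd i).
idx = np.arange(1000)
X = idx.reshape(-1, 1)
y = np.where(idx % 2 == 0, 1001.0, 999.0)

# random_state is fixed only to make this sketch repeatable (an assumption).
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# R-squared as 1 - SS_res / SS_tot, evaluated on the test split.
ss_res = np.sum((y_test - pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print("manual R^2:", r2_manual)
print(".score R^2:", model.score(X_test, y_test))

# Hypothetical alternative feature: the parity of the index.
# The target is an exact linear function of it: y = 1001 - 2 * (i % 2).
X_parity = (idx % 2).reshape(-1, 1)
Xp_train, Xp_test, yp_train, yp_test = train_test_split(
    X_parity, y, random_state=0
)
parity_model = LinearRegression().fit(Xp_train, yp_train)
r2_parity = parity_model.score(Xp_test, yp_test)

print("parity-feature R^2:", r2_parity)
```

The first two numbers should agree and sit close to 0, while the parity model should score essentially 1.0. In other words, the pattern is only "obvious" to a linear model if the feature you hand it is linearly related to the target.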