Fit function cannot perform reduce with flexible type
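I have a dataset with the area and price of 42 apartments. I am using Python with Databricks, and I loaded a CSV file that uses , as the column delimiter. I then specified area as an integer and price as a double. Next I imported the plotting library and the regression module: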
import matplotlib.pyplot as plt
from sklearn import linear_model
Then I read my dataset:
aptos=sqlContext.read.format('csv').options(header='true',
interSchema='true').load('/FileStore/tables/yl3r1mgv1507304115516/aptos_dataset-5ad32.csv')
display(aptos)
With the following lines I created the input variables from the columns of the dataset:
X=aptos.select("area").collect()
Y=aptos.select("precio").collect()
Then I created the regression model:
regr = linear_model.LinearRegression()
Up to this point I had no problems. But when I run the following line:
regr.fit(X,Y)
I get this error:
TypeError: cannot perform reduce with flexible type
Here is the full traceback:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<command-2158797891361999> in <module>()
1
2
----> 3 regr.fit(X,Y)
/databricks/python/local/lib/python2.7/site-packages/sklearn/linear_model/base.pyc in fit(self, X, y, sample_weight)
517 X, y, X_offset, y_offset, X_scale = self._preprocess_data(
518 X, y, fit_intercept=self.fit_intercept, normalize=self.normalize,
--> 519 copy=self.copy_X, sample_weight=sample_weight)
520
521 if sample_weight is not None:
/databricks/python/local/lib/python2.7/site-packages/sklearn/linear_model/base.pyc in _preprocess_data(X, y, fit_intercept, normalize, copy, sample_weight, return_mean)
197 else:
198 X_scale = np.ones(X.shape[1])
--> 199 y_offset = np.average(y, axis=0, weights=sample_weight)
200 y = y - y_offset
201 else:
/databricks/python/local/lib/python2.7/site-packages/numpy/lib/function_base.pyc in average(a, axis, weights, returned)
933
934 if weights is None:
--> 935 avg = a.mean(axis)
936 scl = avg.dtype.type(a.size/avg.size)
937 else:
/databricks/python/local/lib/python2.7/site-packages/numpy/core/_methods.pyc in _mean(a, axis, dtype, out, keepdims)
63 dtype = mu.dtype('f8')
64
---> 65 ret = umr_sum(arr, axis, dtype, out, keepdims)
66 if isinstance(ret, mu.ndarray):
67 ret = um.true_divide(
TypeError: cannot perform reduce with flexible type
Sorry, I cannot share my dataset. I am new to Python and have more experience with R. Thank you very much for your help.
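Thanks, Abdou. There was a typo in the option name when I read my dataset (interSchema instead of inferSchema). This is the correct way: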
aptos=sqlContext.read.format('csv').options(header='true', inferSchema='true').load('/FileStore/tables/yl3r1mgv1507304115516/aptos_dataset-5ad32.csv')
Now the regression works:
regr.fit(X,Y)
Out[4]: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
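For readers who hit the same message, here is a minimal sketch of what probably happened, assuming the misspelled option was ignored and every CSV column was therefore read as a string. It uses only NumPy; the tuples below are hypothetical stand-ins for the Row objects returned by collect(), not the actual data. NumPy calls string dtypes "flexible", and it cannot sum or average them, which is what the mean inside fit attempts.

```python
import numpy as np

# Hypothetical stand-in for aptos.select("precio").collect() when the
# column was loaded as text: a list of one-field rows holding strings.
rows_as_strings = [("100000.0",), ("152000.5",), ("98000.0",)]

# Converting the rows gives a string array, not a numeric one.
y_raw = np.array(rows_as_strings)
print(y_raw.dtype.kind)  # 'U' means unicode string on Python 3

# Averaging a string array fails with a TypeError, as in the traceback.
try:
    y_raw.mean(axis=0)
except TypeError as exc:
    print("mean failed:", exc)

# Once the values are numeric, the same reduction works.
y = y_raw.astype(float)
y_mean = y.mean(axis=0)
print(y.shape, y_mean)
```

Casting with astype(float) is only shown to illustrate the cause. Reading the file with inferSchema='true', as above, addresses the problem at the source.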