Sklearn-Pandas DataFrameMapper: mapper.fit_transform gives ValueError: bad input shape (8, 2)
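I was able to reproduce the example given in the GitHub repository. However, when I try it on my own data, I get a ValueError.
Below is some dummy data that produces the same error as my real data.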
import pandas as pd
import numpy as np
from sklearn_pandas import DataFrameMapper
from sklearn.preprocessing import LabelEncoder, StandardScaler, MinMaxScaler
data = pd.DataFrame({
    'pet': ['cat', 'dog', 'dog', 'fish', 'cat', 'dog', 'cat', 'fish'],
    'children': [4., 6, 3, 3, 2, 3, 5, 4],
    'salary': [90, 24, 44, 27, 32, 59, 36, 27],
    'feat4': ['linear', 'circle', 'linear', 'linear', 'linear', 'circle', 'circle', 'linear'],
})
mapper = DataFrameMapper([
    (['pet', 'feat4'], LabelEncoder()),
    (['children', 'salary'], [StandardScaler(), MinMaxScaler()]),
])
np.round(mapper.fit_transform(data.copy()),2)
The error is as follows:
ValueError Traceback (most recent call last)
in ()
----> 1 np.round(mapper.fit_transform(data.copy()),2)
C:\Users\E245713\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
453 if y is None:
454 # fit method of arity 1 (unsupervised transformation)
--> 455 return self.fit(X, **fit_params).transform(X)
456 else:
457 # fit method of arity 2 (supervised transformation)
C:\Users\E245713\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn_pandas\dataframe_mapper.py in fit(self, X, y)
95 for columns, transformers in self.features:
96 if transformers is not None:
---> 97 transformers.fit(self._get_col_subset(X, columns))
98 return self
99
C:\Users\E245713\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py in fit(self, y)
106 self : returns an instance of self.
107 """
--> 108 y = column_or_1d(y, warn=True)
109 _check_numpy_unicode_bug(y)
110 self.classes_ = np.unique(y)
C:\Users\E245713\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\utils\validation.py in column_or_1d(y, warn)
549 return np.ravel(y)
550
--> 551 raise ValueError("bad input shape {0}".format(shape))
552
553
ValueError: bad input shape (8, 2)
Can anyone help?
Thanks.
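You should pass multiple columns to a single transformer only if it really needs multi-column input (for example, sklearn.decomposition.PCA(1) in the documentation). LabelEncoder expects a single 1-D column, which is why the column_or_1d check in the traceback rejects an input of shape (8, 2). In your case, the error ultimately comes from this line: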
(['pet', 'feat4'], LabelEncoder()),
Even this will not work:
(['pet', 'feat4'], [LabelEncoder(), LabelEncoder()]),
You have to do it like this, with one entry per column:
mapper_good = DataFrameMapper([
    (['pet'], LabelEncoder()),
    (['feat4'], LabelEncoder()),
    (['children'], StandardScaler()),
    (['salary'], MinMaxScaler()),
])
np.round(mapper_good.fit_transform(data.copy()),2)
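To see the underlying cause in isolation, here is a minimal sketch that uses only NumPy and scikit-learn, not sklearn_pandas itself. The variable names (two_columns, pet_codes, feat4_codes) are made up for illustration, and the exact wording of the error message may differ between scikit-learn versions. It shows LabelEncoder rejecting a two-column array and accepting each column on its own:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Same categorical values as the 'pet' and 'feat4' columns in the question.
pet = np.array(['cat', 'dog', 'dog', 'fish', 'cat', 'dog', 'cat', 'fish'])
feat4 = np.array(['linear', 'circle', 'linear', 'linear',
                  'linear', 'circle', 'circle', 'linear'])

# Stacking both columns gives shape (8, 2), which is comparable to what
# the mapper hands to the transformer for ['pet', 'feat4'].
two_columns = np.column_stack([pet, feat4])

try:
    LabelEncoder().fit(two_columns)
    failed = False
except ValueError as exc:
    # LabelEncoder only accepts 1-D input, so the 2-D array is rejected.
    failed = True
    print("2-D input rejected:", exc)

# One encoder per 1-D column works; classes are sorted before numbering.
pet_codes = LabelEncoder().fit_transform(pet)
feat4_codes = LabelEncoder().fit_transform(feat4)
print(pet_codes)    # [0 1 1 2 0 1 0 2]
print(feat4_codes)  # [1 0 1 1 1 0 0 1]
```

This is why each categorical column needs its own LabelEncoder entry in the mapper: the encoder learns one set of classes from one column at a time.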