sklearn OneHotEncoder wrong shape
I have an array:
y_train: array([ 0, 0, 0, -1, 1, 0, -1, 0, ..., -1, 0, 1], dtype=int64)
I did this:
enc = OneHotEncoder()
y_train = enc.fit_transform(y_train.reshape(1,-1))
The result is:
(0, 0) 1.0
(0, 1) 1.0
(0, 2) 1.0
(0, 3) 1.0
(0, 4) 1.0
(0, 5) 1.0
But what I actually want is a one-hot encoding like this:
[1,0,0]
[1,0,0]
[0,1,0]
[0,0,1]
.....
How do I fix this?
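OneHotEncoder expects input of shape (n_samples, n_features), so reshape y_train with reshape(-1, 1) (one row per label, one column) rather than reshape(1, -1), which yields a single sample with one feature per label. After encoding y_train, call toarray() to convert the sparse result into a dense array: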
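from sklearn import preprocessing
import numpy as np
# One row per label, a single feature column
y_train = np.array([0, 0, 0, -1, 1, 0, -1, 0, -1, 0, 1]).reshape(-1, 1)
enc = preprocessing.OneHotEncoder()
# fit_transform returns a sparse matrix; toarray() makes it dense
y_train = enc.fit_transform(y_train).toarray()
print(y_train)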
You will get this output:
[[0. 1. 0.]
[0. 1. 0.]
[0. 1. 0.]
[1. 0. 0.]
[0. 0. 1.]
[0. 1. 0.]
[1. 0. 0.]
[0. 1. 0.]
[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
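To see why the original call produced a single row of ones, it may help to compare the two reshapes side by side. The sketch below is only an illustration on the same sample data; the variable names `labels`, `wrong`, and `right` are made up for this example and are not from the question.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([0, 0, 0, -1, 1, 0, -1, 0, -1, 0, 1])

# reshape(1, -1): one sample with 11 features. Each feature has exactly
# one observed category, so every column encodes to a single 1.0.
wrong = OneHotEncoder().fit_transform(labels.reshape(1, -1))
print(wrong.shape)  # (1, 11)

# reshape(-1, 1): 11 samples with one feature that takes 3 categories.
enc = OneHotEncoder()
right = enc.fit_transform(labels.reshape(-1, 1)).toarray()
print(right.shape)  # (11, 3)

# The columns follow the sorted categories, here -1, 0, 1,
# which is why the label 0 encodes as [0, 1, 0].
print(enc.categories_)
```

The column order comes from `enc.categories_`, so if a different order is needed (for example 0 first, as in the question), the columns would have to be reordered after encoding.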