tflearn to_categorical: processing data from pandas df.values (an array of arrays)
import numpy as np
from tflearn.data_utils import to_categorical

labels = np.array([['positive'],['negative'],['negative'],['positive']])
# output from pandas is similar to the above
values = (labels=='positive').astype(np.int_)
to_categorical(values,2)
Output:
array([[ 1., 1.],
[ 1., 1.],
[ 1., 1.],
[ 1., 1.]])
If I remove the inner list that wraps each element, it seems to work fine:
labels = np.array([['positive'],['negative'],['negative'],['positive']])
values = (labels=='positive').astype(np.int_)
to_categorical(values.T[0],2)
Output:
array([[ 0., 1.],
[ 1., 0.],
[ 1., 0.],
[ 0., 1.]])
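Why does this happen? I am following some tutorials, and they seem to get the correct output even for an array of arrays. Did a recent upgrade change this behavior?
I am using tflearn (0.3.2) on py362.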
Looking at the source code of to_categorical:
def to_categorical(y, nb_classes):
    """ to_categorical.
    Convert class vector (integers from 0 to nb_classes)
    to binary class matrix, for use with categorical_crossentropy.
    Arguments:
        y: `array`. Class vector to convert.
        nb_classes: `int`. Total number of classes.
    """
    y = np.asarray(y, dtype='int32')
    if not nb_classes:
        nb_classes = np.max(y)+1
    Y = np.zeros((len(y), nb_classes))
    Y[np.arange(len(y)),y] = 1.
    return Y
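The core part is the advanced indexing Y[np.arange(len(y)),y] = 1., which uses the input vector y as column indices into the result array. So y needs to be a 1-D array for this to work properly, and for an arbitrary 2-D array you will usually get a broadcasting error. An (n, 1) column such as the one in the question is the exception: it does broadcast against np.arange(len(y)), pairing every row index with every label, which is why the result is filled with ones instead of raising an error.
For example, a (2, 3) input raises: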
to_categorical([[1,2,3],[2,3,4]], 2)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
 in ()
----> 1 to_categorical([[1,2,3],[2,3,4]], 2)

c:\anaconda3\envs\tensorflow\lib\site-packages\tflearn\data_utils.py in to_categorical(y, nb_classes)
     40         nb_classes = np.max(y)+1
     41     Y = np.zeros((len(y), nb_classes))
---> 42     Y[np.arange(len(y)),y] = 1.
     43     return Y
     44

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (2,3)
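The silent all-ones result for the (4, 1) input in the question can be reproduced without tflearn. The sketch below is NumPy only and simply copies the indexing line from the source shown earlier; the variable names are illustrative. It shows how the two index arrays broadcast to a (4, 4) grid, so every row ends up paired with every label:

```python
import numpy as np

# Column vector shaped (4, 1), like the values array in the question.
values = np.array([[1], [0], [0], [1]])

rows = np.arange(len(values))  # shape (4,)

# The two index arrays broadcast to a common (4, 4) shape, so every row
# index is paired with every label found in values.
paired_rows, paired_cols = np.broadcast_arrays(rows, values)
print(paired_rows.shape)

Y = np.zeros((len(values), 2))
Y[rows, values] = 1.  # same indexing as in to_categorical
print(Y)              # every entry has been set to 1

# With a 1-D vector, row i is paired only with label i.
Y_flat = np.zeros((len(values), 2))
Y_flat[rows, values.ravel()] = 1.
print(Y_flat)
```

Because the labels contain both 0 and 1, both columns of every row get written, which matches the output in the question.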
Any of these approaches works:
to_categorical(values.ravel(), 2)
array([[ 0., 1.],
[ 1., 0.],
[ 1., 0.],
[ 0., 1.]])
to_categorical(values.squeeze(), 2)
array([[ 0., 1.],
[ 1., 0.],
[ 1., 0.],
[ 0., 1.]])
to_categorical(values[:,0], 2)
array([[ 0., 1.],
[ 1., 0.],
[ 1., 0.],
[ 0., 1.]])
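Since the labels in the question come from pandas, another option may be to avoid the extra dimension at the source by selecting a single column (a Series) instead of converting a one-column DataFrame. This is only a sketch: the DataFrame and the column name sentiment are hypothetical, and the one-hot step is written out in NumPy (mirroring the source above) so the snippet runs without tflearn:

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame; the column name "sentiment" is an assumption.
df = pd.DataFrame({"sentiment": ["positive", "negative", "negative", "positive"]})

two_d = df.to_numpy()               # whole frame: shape (4, 1), the problematic input
one_d = df["sentiment"].to_numpy()  # single column: shape (4,), already 1-D
print(two_d.shape, one_d.shape)

values = (one_d == "positive").astype(np.int_)

# Same logic as to_categorical, written out so tflearn is not required here.
one_hot = np.zeros((len(values), 2))
one_hot[np.arange(len(values)), values] = 1.
print(one_hot)
```

With a 1-D values array like this, passing it to to_categorical(values, 2) should need no ravel(), squeeze(), or [:,0] step.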