Extracting features and labels from images before feeding them to a densely connected classifier in a convolutional network
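I am trying to extract features and labels from images and then feed them to a densely connected classifier, using VGG16 as the convolutional base.
The function that extracts the features is given below.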
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 30

# conv_base is defined earlier (not shown)
def extract_features(dataframe, directory, sample_count, x_col, y_col):
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(shape=(sample_count))
    generator = datagen.flow_from_dataframe(dataframe, directory,
                                            x_col,
                                            y_col,
                                            target_size=(150, 150),
                                            batch_size=batch_size,
                                            class_mode='raw')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        # The generator loops indefinitely, so stop once every sample has been seen
        if i * batch_size >= sample_count:
            break
    return features, labels
But when I try
train_features, train_labels = extract_features(dataframe=combined[:790],directory=directory1,x_col='file_name',y_col=target_columns,sample_count=790)
#validation_features, validation_labels = extract_features(combined[790:1002],directory=directory1, sample_count=212)
I get the following error.
Found 790 validated image filenames.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-44-9dd7d3b4ac22> in <module>()
----> 1 train_features, train_labels = extract_features(dataframe=combined[:790],directory=directory1,x_col='file_name',y_col=target_columns,sample_count=790)
<ipython-input-42-c13d7901073d> in extract_features(dataframe, directory, sample_count, x_col, y_col)
14 features_batch = conv_base.predict(inputs_batch)
15 features[i * batch_size : (i + 1) * batch_size] = features_batch
---> 16 labels[i * batch_size : (i + 1) * batch_size] = labels_batch
17 i += 1
18 if i * batch_size >= sample_count:
ValueError: could not broadcast input array from shape (30,63) into shape (30)
Note that the data batch and label batch shapes for my data are given below.
for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break
data batch shape: (32, 150, 150, 3)
labels batch shape: (32, 63)
I applied one-hot encoding. The data frame has 64 columns in total. The first column, 'file_name', is the X column, and the remaining 63 columns are the targets.
In [72]:
combined.columns
Out[72]:
Index(['file_name', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10',
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22',
'23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34',
'35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46',
'47', '48', '49', '50', '51', '52', '53', '54', '55', '58', '60', '61',
'62', '63', '67', '69'],
dtype='object')
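Each label batch has shape (30, 63), one column per target, but labels is allocated as a 1-D array of length sample_count, so the assignment cannot broadcast. In your extract_features function, try initializing the labels array with a second dimension instead: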
labels = np.zeros(shape=(sample_count, len(y_col)))
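To see why the extra dimension matters, here is a minimal sketch that uses NumPy only, with no Keras involved. The helper collect_labels and the synthetic one-hot batches are hypothetical stand-ins for the generator loop in extract_features, and the sizes (70 samples, batches of 30, 63 targets) are chosen purely for illustration. It first triggers the same kind of ValueError with a 1-D buffer, then fills a 2-D buffer successfully, including a smaller final batch.

```python
import numpy as np


def collect_labels(batches, sample_count, batch_size, n_targets):
    """Copy label batches of shape (batch, n_targets) into one preallocated array.

    Hypothetical helper that mirrors the label-handling part of extract_features.
    """
    # 2-D buffer: one row per sample, one column per target
    labels = np.zeros(shape=(sample_count, n_targets))
    i = 0
    for labels_batch in batches:
        # The slice is clipped at sample_count, so a short final batch still fits
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            break
    return labels


# Illustrative sizes only (assumptions, not taken from the question)
sample_count, batch_size, n_targets = 70, 30, 63

# Synthetic one-hot labels standing in for the 63 target columns
all_labels = np.eye(n_targets)[np.arange(sample_count) % n_targets]
batches = [all_labels[s : s + batch_size]
           for s in range(0, sample_count, batch_size)]

# A 1-D buffer cannot hold a (30, 63) batch: this raises ValueError
bad = np.zeros(shape=(sample_count))
try:
    bad[0:batch_size] = batches[0]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# The 2-D buffer accepts every batch
labels = collect_labels(batches, sample_count, batch_size, n_targets)
print(labels.shape)
```

Note that len(y_col) only gives the number of targets when y_col is a list of column names, as target_columns appears to be here. If a single column name were passed as a string, len would return the number of characters in that name instead.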