Extracting features and labels from images before feeding them to a densely connected classifier in a convolutional network

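I am trying to extract features and labels from images with VGG16 and then feed them to a densely connected classifier. The function that extracts the features is given below.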

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 30
def extract_features(dataframe,directory, sample_count,x_col,y_col):
  features = np.zeros(shape=(sample_count, 4, 4, 512))
  labels = np.zeros(shape=(sample_count))
  generator = datagen.flow_from_dataframe(dataframe,directory,
  x_col,
  y_col,
  target_size=(150, 150),batch_size=batch_size,class_mode='raw')
  i = 0
  for inputs_batch, labels_batch in generator:
    features_batch = conv_base.predict(inputs_batch)
    features[i * batch_size : (i + 1) * batch_size] = features_batch
    labels[i * batch_size : (i + 1) * batch_size] = labels_batch
    i += 1
    if i * batch_size >= sample_count:
      break
  return features, labels

But when I try

train_features, train_labels = extract_features(dataframe=combined[:790],directory=directory1,x_col='file_name',y_col=target_columns,sample_count=790)
#validation_features, validation_labels = extract_features(combined[790:1002],directory=directory1, sample_count=212)

I get the following error.

Found 790 validated image filenames.

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-44-9dd7d3b4ac22> in <module>()
----> 1 train_features, train_labels = extract_features(dataframe=combined[:790],directory=directory1,x_col='file_name',y_col=target_columns,sample_count=790)

<ipython-input-42-c13d7901073d> in extract_features(dataframe, directory, sample_count, x_col, y_col)
     14     features_batch = conv_base.predict(inputs_batch)
     15     features[i * batch_size : (i + 1) * batch_size] = features_batch
---> 16     labels[i * batch_size : (i + 1) * batch_size] = labels_batch
     17     i += 1
     18     if i * batch_size >= sample_count:

ValueError: could not broadcast input array from shape (30,63) into shape (30)

Note that the data batch and label batch shapes for my data are given below.

for data_batch, labels_batch in train_generator:
  print('data batch shape:', data_batch.shape)
  print('labels batch shape:', labels_batch.shape)
  break

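data batch shape: (32, 150, 150, 3)
labels batch shape: (32, 63)

I applied one-hot encoding. The dataframe has 64 columns in total. The first column is "file_name", which is the X column, and the remaining 63 columns are the targets.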

In [72]:

combined.columns

Out[72]:

Index(['file_name', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10',
       '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22',
       '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34',
       '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46',
       '47', '48', '49', '50', '51', '52', '53', '54', '55', '58', '60', '61',
       '62', '63', '67', '69'],
      dtype='object')

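In your extract_features function, labels is allocated as a 1-D array of shape (sample_count,), but with class_mode='raw' and 63 target columns each labels_batch has shape (batch_size, 63). Try initializing the labels array this way: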

labels = np.zeros(shape=(sample_count, len(y_col)))
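
To check the shape logic without Keras, here is a minimal sketch of the same loop using NumPy only. The names fake_batches, fake_predict, predict_fn and n_targets are hypothetical stand-ins (assumptions, not part of the original code) for the dataframe generator, conv_base.predict and len(y_col). The sketch also trims the final batch, which holds fewer useful samples when sample_count is not a multiple of the batch size.

```python
import numpy as np

def extract_features(batches, predict_fn, sample_count, n_targets,
                     feature_shape=(4, 4, 512)):
    """Collect features and multi-column labels from a batch iterator."""
    features = np.zeros((sample_count, *feature_shape))
    labels = np.zeros((sample_count, n_targets))  # 2-D: one row per sample
    start = 0
    for inputs_batch, labels_batch in batches:
        # Clamp the slice so the last batch never runs past sample_count.
        end = min(start + len(inputs_batch), sample_count)
        features[start:end] = predict_fn(inputs_batch)[: end - start]
        labels[start:end] = labels_batch[: end - start]
        start = end
        if start >= sample_count:
            break  # the generator loops forever, so stop explicitly
    return features, labels

# Hypothetical stand-in for datagen.flow_from_dataframe(..., class_mode='raw').
def fake_batches(batch_size=30, n_targets=63):
    rng = np.random.default_rng(0)
    while True:
        images = np.zeros((batch_size, 150, 150, 3), dtype=np.float32)
        targets = rng.integers(0, 2, size=(batch_size, n_targets)).astype(float)
        yield images, targets

# Hypothetical stand-in for conv_base.predict: only the output shape matters.
def fake_predict(images):
    return np.ones((len(images), 4, 4, 512))

features, labels = extract_features(
    fake_batches(), fake_predict, sample_count=790, n_targets=63)
print(features.shape, labels.shape)
```

With the real generator you would pass n_targets=len(y_col), assuming y_col is the list of 63 target column names.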