tensorflow: incompatible shapes related to batch size

I'm having trouble training my tensorflow model, and it appears to be related to the batch size. If I set the batch size to 1, it trains fine.

If I set the batch size to 6 and provide 13 records, I get this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [34,2] vs. [32,2]

If I set the batch size to 32 and provide 64 records, I get this error:

    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [34,2] vs. [32,2]

The last thing I checked was whether the record count has to be a multiple of the batch size, but that doesn't seem to be it.
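Whether the record count must divide evenly by the batch size is easy to check: with a ceil-based `__len__` like the one in the generator below, an uneven split simply yields a shorter final batch. A small sketch (the `batch_sizes` helper is hypothetical, just for illustration); note that 64 records with a batch size of 32 divides evenly, so a short final batch alone cannot explain the second error:

```python
import math

def batch_sizes(n_records, batch_size):
    """Hypothetical helper: the per-batch sample counts that a
    ceil-based __len__ yields over one epoch."""
    n_batches = math.ceil(n_records / batch_size)
    return [min(batch_size, n_records - i * batch_size) for i in range(n_batches)]

print(batch_sizes(13, 6))   # [6, 6, 1] -- the last batch holds a single record
print(batch_sizes(64, 32))  # [32, 32] -- divides evenly, every batch is full
```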

My model takes inputs of shape (960, 960, 3) and produces outputs of shape (2).

Here is the code for my data generator:

class DataGenerator(tf.keras.utils.Sequence):
    'Generates data for Keras'

    def __init__(self,
                 directory,
                 collection_name,
                 batch_size=32,
                 target_size=(128, 128),  # width, height
                 shuffle=False,
                 limit=None):
        'Initialization'
        self.target_size = target_size
        self.batch_size = batch_size
        self.directory = directory

        client = MongoClient(CONNECTION_STRING)

        # Create the database for our example (we will use the same database throughout the tutorial)
        db = client[DB_NAME]
        col = db[collection_name]
        captures = col.find()
        if limit is not None:
            captures = captures.limit(limit)

        self.img_paths = []
        self.img_paths_wo_ext = []

        df = pd.DataFrame()

        self.count = 0

        for capture in captures:
            img_path = os.path.join(directory, capture['ImageName'])
            if os.path.exists(img_path):
                df = df.append({'ImageName': img_path, 'X': capture['X'], 'Y': capture['Y']}, ignore_index=True)
                self.img_paths.append(img_path)
                self.img_paths_wo_ext.append(os.path.splitext(img_path)[0])
            else:
                print(f"{img_path} for capture {capture['_id']} does not exist")
            self.count += 1

        df.set_index('ImageName', inplace=True)

        self.targets = df
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.ceil(len(self.img_paths) / self.batch_size))

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indexes of the batch
        indexes = self.indexes[index * self.batch_size:min((index + 1) * self.batch_size, len(self.indexes))]

        # Find list of IDs
        list_paths = [self.img_paths[k] for k in indexes]
        list_paths_wo_ext = [self.img_paths_wo_ext[k] for k in indexes]
        # Generate data
        X, y = self.__data_generation(list_paths, list_paths_wo_ext)

        return X, y

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.indexes = np.arange(len(self.img_paths))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, list_paths, list_paths_wo_ext):
        'Generates data containing batch_size samples'  # X : (n_samples, *dim, n_channels)
        # Initialization
        x = np.empty((self.batch_size, self.target_size[1], self.target_size[0], 3))

        y = self.targets.loc[list_paths].values

        # Generate data
        for i, ID in enumerate(list_paths):
            size = None
            resize_cache_path = f'{ID}.resized.{self.target_size[0]}x{self.target_size[1]}.png'
            resized = None  # type: Image
            # Store sample
            img = Image.open(ID)  # type: Image
            try:
                img.load()  # required for png.split()
            except BaseException as ex:
                raise Exception(f'Error loading PNG \'{ID}\': {str(ex)}')
            if size is not None:
                raise Exception(f'Image already loaded for ID: {ID}, paths: {list_paths}, size: {size}')
            size = img.size

            if os.path.isfile(resize_cache_path):
                resized = Image.open(resize_cache_path)
                resized.load()
            else:
                resized = img.resize(self.target_size)
                resized.save(resize_cache_path)
            x[i] = resized

            y[i][0] = (y[i][0] / size[0]) * self.target_size[0]
            y[i][1] = (y[i][1] / size[1]) * self.target_size[1]

        return x, y

What am I doing wrong?

It turned out there were two problems.

First, the numpy array initialization needed to be capped at the number of samples remaining in the last batch:

x = np.empty((min(self.batch_size, len(list_paths)), self.target_size[1], self.target_size[0], 3))
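In other words, `x` is sized by the samples actually present rather than by `self.batch_size`. A minimal sketch of the corrected allocation (the `make_batch_array` wrapper is hypothetical; only the `min(...)` expression comes from the fix above):

```python
import numpy as np

def make_batch_array(list_paths, batch_size, target_size):
    """Allocate the image batch sized to the samples actually present,
    so a short final batch no longer mismatches its labels (sketch)."""
    n = min(batch_size, len(list_paths))
    return np.empty((n, target_size[1], target_size[0], 3))

# Final batch of an epoch with 13 records and batch_size=6: one path left over.
x = make_batch_array(['img_000.png'], batch_size=6, target_size=(128, 128))
print(x.shape)  # (1, 128, 128, 3)
```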

Second, my input did contain duplicate records; removing them fixed the remaining mismatch.
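The duplicates matter because `__data_generation` looks the labels up with `self.targets.loc[list_paths]`: when the `ImageName` index contains duplicate labels, `.loc` returns every matching row, so a batch of 32 paths can yield a label array with more than 32 rows, which would explain a `[34,2] vs. [32,2]` mismatch. A minimal demonstration with made-up data:

```python
import pandas as pd

# Two rows share the index label 'a.png', mimicking a duplicate record.
targets = pd.DataFrame({'X': [10, 20, 30], 'Y': [40, 50, 60]},
                       index=pd.Index(['a.png', 'b.png', 'a.png'], name='ImageName'))

batch = targets.loc[['a.png', 'b.png']]
print(len(batch))  # 3 -- two requested labels, but three rows returned
```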