Stylegan2-ada tfrecords - ValueError: axes don't match array, images will work one run and not work the next

Question

I'm training a GAN in Google Colab on a dataset of images that I scraped from Wikiart and resized to 1024x1024, but I keep getting this error when creating the tfrecords:

Traceback (most recent call last):
  File "dataset_tool.py", line 1249, in <module>
    execute_cmdline(sys.argv)
  File "dataset_tool.py", line 1244, in execute_cmdline
    func(**vars(args))
  File "dataset_tool.py", line 714, in create_from_images
    img = img.transpose([2, 0, 1]) # HWC => CHW
ValueError: axes don't match array

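I set it up to print the file it stops on and started taking those files out of the dataset, but where it stalls seems completely random. It will iterate over a file without trouble on one run, then fail on that same file on the next run, after some other problematic photo has been taken out of the dataset.

I'm not sure that constantly removing the photos that stall it will ever end, or that it will leave me with a meaningful dataset. Is there something I should try in order to fix it?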

Answer 1

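Figured out the solution: it turns out some of the images I scraped were grayscale. To fix it, I used ImageMagick (which I had also used to resize the photos to 1024x1024) to check the colorspace. I pointed the terminal at the image folder and ran: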

magick identify *.jpg

In the output, I used Ctrl+F to find which images were marked "Gray" instead of "sRGB". After taking those out of the dataset, it worked like a charm.
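For anyone without ImageMagick at hand, roughly the same check can be sketched in Python with Pillow. This is only a sketch and not the approach used above: `find_non_rgb_images` is a hypothetical helper name, and it assumes Pillow is installed and that every file in the folder is meant to be an image. It reports each file whose Pillow mode is anything other than "RGB"; a grayscale JPEG shows up as mode "L", which NumPy loads as a 2-D array, and that is the kind of array `transpose([2, 0, 1])` rejects.

```python
import os

from PIL import Image


def find_non_rgb_images(image_dir):
    """Return (filename, mode) pairs for files whose Pillow mode is not 'RGB'.

    Hypothetical helper for illustration; it is not part of dataset_tool.py.
    """
    offenders = []
    for name in sorted(os.listdir(image_dir)):
        path = os.path.join(image_dir, name)
        try:
            with Image.open(path) as img:
                if img.mode != "RGB":
                    offenders.append((name, img.mode))
        except OSError:
            # Covers unreadable files and files Pillow cannot identify.
            offenders.append((name, "unreadable"))
    return offenders
```

Calling `find_non_rgb_images("path/to/images")` (the path is a placeholder) returns a list that can be printed or used to move the flagged files elsewhere.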

Answer 2

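I ran into this a little while ago, and after spending more time than I'm willing to admit searching through and pruning data from my source set, I found your question. I even did the ImageMagick search to purge grayscale images from the dataset, but the problem persisted.

I even exported my dataset to one of our Macs so I could use Preview to batch-edit color and resolution and export new, uniform JPEGs. Still not fixed. Here is the solution I came up with. (Update dataset_tool.py in your runtime workspace.)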

# Relies on names already available in dataset_tool.py: glob, os, np (numpy),
# PIL.Image, error() and TFRecordExporter.
def create_from_images(tfrecord_dir, image_dir, shuffle):
    print('Loading images from "%s"' % image_dir)
    image_filenames = sorted(glob.glob(os.path.join(image_dir, '*')))
    if len(image_filenames) == 0:
        error('No input images found')

    # Convert the first image to RGB as well, so that `channels` matches the
    # images written in the loop below even if the first file is grayscale.
    img = np.asarray(PIL.Image.open(image_filenames[0]).convert("RGB"))
    resolution = img.shape[0]
    channels = img.shape[2] if img.ndim == 3 else 1
    if img.shape[1] != resolution:
        error('Input images must have the same width and height')
    if resolution != 2 ** int(np.floor(np.log2(resolution))):
        error('Input image resolution must be a power-of-two')
    if channels not in [1, 3]:
        error('Input images must be stored as RGB or grayscale')

    with TFRecordExporter(tfrecord_dir, len(image_filenames)) as tfr:
        order = tfr.choose_shuffled_order() if shuffle else np.arange(len(image_filenames))
        for idx in range(order.size):
            pil_img = PIL.Image.open(image_filenames[order[idx]])
            # Force every image to 3-channel RGB, whatever its original mode.
            pil_img = pil_img.convert("RGB")
            img = np.asarray(pil_img)
            #print('\nimg: "%s" (%d)' % (image_filenames[order[idx]], channels))
            if channels == 1:
                img = img[np.newaxis, :, :] # HW => CHW
            else:
                img = img.transpose([2, 0, 1]) # HWC => CHW
            tfr.add_image(img)

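Basically, it just uses PIL to convert every image to RGB, regardless of its original mode.
It does slow down the preparation process slightly, but it is handy if your training data comes from a variety of sources.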
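To see why the conversion helps, here is a small standalone sketch, separate from dataset_tool.py, that builds blank test images in several Pillow modes and prints their array shapes before and after `convert("RGB")`. The helper name `to_chw_rgb` is made up for this illustration. The point is that grayscale ("L") and palette ("P") images load as 2-D arrays, and RGBA or CMYK images load with four channels, whereas after conversion every image is height x width x 3 and the HWC => CHW transpose has the axes it expects.

```python
import numpy as np
from PIL import Image


def to_chw_rgb(pil_img):
    """Convert any PIL image to an array of shape (3, H, W).

    Hypothetical helper; it mirrors the convert-then-transpose step above.
    """
    hwc = np.asarray(pil_img.convert("RGB"))  # always (H, W, 3)
    return hwc.transpose([2, 0, 1])           # HWC => CHW


# Blank 16x16 images in several modes, standing in for a mixed dataset.
for mode in ["L", "P", "RGB", "RGBA", "CMYK"]:
    img = Image.new(mode, (16, 16))
    print(mode, np.asarray(img).shape, "->", to_chw_rgb(img).shape)
```

Each line of output shows the raw array shape for that mode followed by the shape after conversion, which should be the same for all five modes.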

Tags: python, machine-learning, google-colaboratory, generative-adversarial-network, stylegan