How do I train two sets of data given both files separately?

Question

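I'm working on a project where I need to estimate people's ages from X-rays of their hands. I've been given a test set containing a large number of images (in a folder on my computer), all numbered, along with a CSV file that maps each image number to two pieces of information: the age (in months) and whether the person is male (given as "True" or "False"). I believe I've successfully imported both into Python (the image folder and the CSV file).

I've gone through a lot of TensorFlow tutorials, but I'm struggling to work out how to associate the image numbers with their labels and how to train on the dataset. Any help would be greatly appreciated!

Below is my code so far, along with how the data is presented to me.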

import pandas as pd
import numpy as np
import os
import tensorflow as tf
import cv2
from tensorflow import keras
from tensorflow.keras.layers import Dense, Input, InputLayer, Flatten
from tensorflow.keras.models import Sequential, Model
from matplotlib import pyplot as plt
import matplotlib.image as mpimg
import random
%matplotlib inline

-- This just imports the libraries I'm using, or expect to use later.

plt.figure(figsize=(20,20))
train_images=r'/Users/FOLDER/downloads/Boneage_competition/training_dataset/boneage-training-dataset'
for i in range(5):
    file = random.choice(os.listdir(train_images))
    image_path= os.path.join(train_images, file)
    img=mpimg.imread(image_path)
    ax=plt.subplot(1,5,i+1)
    ax.title.set_text(file)
    plt.imshow(img)

-- This successfully reads from the image folder and displays 5 random images to check that the import worked.

This screenshot provides an example of how the pictures are depicted

IMG_WIDTH=200
IMG_HEIGHT=200
img_folder=r'/Users/FOLDER/downloads/Boneage_competition/training_dataset/'

-- I believe this resizes all the images to the specified dimensions.

label_file = '/Users/FOLDER/downloads/train.csv'

train_labels = pd.read_csv (r'/Users/FOLDER/downloads/train.csv')

print (train_labels)

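-- This successfully imports the data from the CSV file and prints it to make sure it worked.

If you have any ideas on how to connect these two data sets and train on the data, I'd be very grateful.

Thanks!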

Answer 1

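The approach is simple: create a mapping between the image data and the labels. After that, you can build two lists/np.arrays and use them to pass the training images and labels to your model. The code below should help you get there.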

import os
import glob
import pandas as pd
import matplotlib.image as mpimg

# Map each image number (the file name without its extension) to its pixel data.
# Assumes the files are .png; otherwise change the glob pattern.
train_images = '/Users/FOLDER/downloads/Boneage_competition/training_dataset/boneage-training-dataset'
dic = {}
for file in glob.glob(os.path.join(train_images, '*.png')):
    b_name = os.path.splitext(os.path.basename(file))[0]
    dic[b_name] = mpimg.imread(file)

# Map each image number to its label.
label_file = '/Users/FOLDER/downloads/train.csv'
train_labels = pd.read_csv(label_file)
dic_label_match = {}
for i in range(len(train_labels)):
    # Assumes the first column is the age and image numbers start from 1, in row order.
    # Keys are strings so that they match the file-name keys in dic.
    dic_label_match[str(i + 1)] = float(train_labels.iloc[i, 0])
    # If the column is named, you can use it instead ('age' is a placeholder name):
    # dic_label_match[str(i + 1)] = float(train_labels.iloc[i]['age'])

# Now both dicts share the same keys.
# Build two parallel lists / arrays that you can pass to the Keras model.
train_x = []
label_ = []

for val in dic:
    if val in dic_label_match:
        train_x.append(dic[val])
        label_.append(dic_label_match[val])
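
If the CSV has a column holding the image number, the join can be done on that column instead of relying on row order. The following is only a sketch of that idea: the helper name match_images_to_labels and the column names "id" and "boneage" are assumptions, not something shown in the question, so adjust them to the actual CSV header. It returns file paths rather than pixel data, so the images do not all have to be held in memory at once.

```python
import os
import pandas as pd


def match_images_to_labels(image_dir, labels_df, id_col="id",
                           label_col="boneage", ext=".png"):
    """Return parallel lists of image paths and labels, joined on image number.

    id_col, label_col and ext are assumed names; change them to match
    the real CSV header and file extension.
    """
    # Index labels by image number as strings, so they compare equal to
    # file names such as "1377.png" with the extension removed.
    labels_by_id = dict(zip(labels_df[id_col].astype(str), labels_df[label_col]))

    paths, labels = [], []
    for file_name in sorted(os.listdir(image_dir)):
        stem, file_ext = os.path.splitext(file_name)
        if file_ext.lower() != ext or stem not in labels_by_id:
            continue  # skip non-image files and images without a label
        paths.append(os.path.join(image_dir, file_name))
        labels.append(float(labels_by_id[stem]))
    return paths, labels
```

Each returned path can then be read and resized, for example with cv2.imread followed by cv2.resize(img, (IMG_WIDTH, IMG_HEIGHT)). Note that defining IMG_WIDTH and IMG_HEIGHT on their own does not resize anything. Once every image has the same shape, np.stack turns the list into a single array that can be passed to the model together with np.array(labels).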

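For the training step itself, age in months is a continuous value, so this is a regression problem: the model ends in a single linear output and is trained with a regression loss. The sketch below is one possible minimal setup, not a tuned architecture. The layer sizes, the single grayscale channel, the helper name build_age_regressor and the random stand-in arrays are all assumptions made for illustration; replace x and y with your own stacked images and labels.

```python
import numpy as np
import tensorflow as tf

IMG_HEIGHT, IMG_WIDTH = 200, 200


def build_age_regressor(input_shape=(IMG_HEIGHT, IMG_WIDTH, 1)):
    """Small CNN with one linear output unit for predicting age in months."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1),  # no activation: unbounded regression output
    ])
    # Mean absolute error reads directly as "months off on average".
    model.compile(optimizer="adam", loss="mae")
    return model


# Random stand-in data with the expected shapes (NOT real X-rays or ages).
x = np.random.rand(8, IMG_HEIGHT, IMG_WIDTH, 1).astype("float32")
y = np.random.uniform(12, 216, size=(8,)).astype("float32")

model = build_age_regressor()
history = model.fit(x, y, epochs=1, batch_size=4, verbose=0)
```

The images and labels must be in the same order, which is what building them as parallel lists guarantees.
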
python

keras

tensorflow

image-classification