Pytorch features and classes from .npy files
I am new to Pytorch, coming from TensorFlow. In TensorFlow I can simply load features and labels from separate .npy files and train a CNN with them, as simply as this:
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Input, Flatten, Dense, Dropout
from tensorflow.keras.models import Model

def finetune_resnet(file_train_classes, file_train_features, name_model_to_save):
    # Lets load features and classes first
    print("Loading, organizing and pre-processing features")
    num_classes = 12
    x_train = np.load(file_train_features)
    y_train = np.load(file_train_classes)

    # Defining train as 70% and validation 30% of the data
    # The partition is stratified with a fixed random state
    # Therefore, for all networks, the partition will be the same
    x_train, x_validation, y_train, y_validation = train_test_split(x_train, y_train, test_size=0.30, stratify=y_train, random_state=42)

    print("transforming to categorical")
    y_train = to_categorical(y_train, num_classes)
    y_validation = to_categorical(y_validation, num_classes)
    y_train = tf.constant(y_train, shape=[y_train.shape[0], num_classes])
    y_validation = tf.constant(y_validation, shape=[y_validation.shape[0], num_classes])

    print("preprocessing data")
    # Preprocessing data
    x_train = x_train.astype('float32')
    x_validation = x_validation.astype('float32')
    x_train /= 255.
    x_validation /= 255.

    print("Setting up the network")
    # Parameters for network training
    batch_size = 32
    epochs = 300
    sgd = SGD(lr=0.01)
    trainAug = ImageDataGenerator(rotation_range=30, zoom_range=0.15, width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15, horizontal_flip=True, fill_mode="nearest")

    print("Compiling the network")
    # Load model and prepare it for fine tuning
    baseModel = ResNet50(weights="imagenet", include_top=False,
                         input_tensor=Input(shape=(224, 224, 3)))

    # construct the head of the model that will be placed on top of
    # the base model
    headModel = baseModel.output
    headModel = Flatten(name="flatten")(headModel)
    headModel = Dense(512, activation="relu")(headModel)
    headModel = Dropout(0.5)(headModel)
    headModel = Dense(num_classes, activation="softmax")(headModel)

    # place the head FC model on top of the base model (this will become
    # the actual model we will train)
    model = Model(inputs=baseModel.input, outputs=headModel)
    model.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=["accuracy"])

    trainAug.fit(x_train)

    # Fit the model on the batches generated by trainAug.flow().
    print("[INFO] training head...")
    H = model.fit(trainAug.flow(x_train, y_train, batch_size=batch_size), steps_per_epoch=x_train.shape[0] // batch_size, epochs=epochs, validation_data=(x_validation, y_validation), callbacks=callbacks)
However, I have no idea how to load, train, and evaluate on the training and test data when they come from .npy files. I looked at a tutorial that loads the training data from folders, which is not what I want.
How can I train and test a ResNet-50 model, starting from the ImageNet weights, loading the training and test data from .npy files with Pytorch?
P.s.: Most Pytorch training loops expect a DataLoader as input for training. Is it possible to convert my training data, which sits in numpy arrays, into that format?
P.s.: You can try it with my data here.
It seems you need to create your own torch.utils.data.Dataset.
import torch

class MyDataSet(torch.utils.data.Dataset):
    def __init__(self, x, y):
        super(MyDataSet, self).__init__()
        # store the raw arrays (as loaded from the .npy files with np.load)
        self._x = x
        self._y = y

    def __len__(self):
        # a Dataset must know its size
        return self._x.shape[0]

    def __getitem__(self, index):
        x = self._x[index, :]
        y = self._y[index]
        return x, y
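To your first P.s.: yes. Once the numpy arrays are wrapped in a Dataset like this, a torch.utils.data.DataLoader gives you the batched, shuffled input that a typical Pytorch training loop consumes (the default collate function turns the numpy slices into tensors). A minimal sketch, with the .npy paths below standing in for your own files:

import numpy as np
from torch.utils.data import DataLoader

# load and scale the arrays as in the Keras version (placeholder paths)
x = np.load("file_train_features.npy").astype("float32") / 255.0
y = np.load("file_train_classes.npy").astype("int64")

train_loader = DataLoader(MyDataSet(x, y), batch_size=32, shuffle=True)

for batch_x, batch_y in train_loader:
    # batch_x and batch_y arrive as torch tensors with a leading batch dimension
    pass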
You can then further use Dataset utilities to split MyDataSet into train and validation sets (for example, with torch.utils.data.random_split). You may also find TensorDataset useful.
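As a sketch of that last point (again with placeholder file names): TensorDataset saves you from writing the class by hand, and random_split gives the 70/30 partition, though unlike sklearn's train_test_split it is not stratified:

import numpy as np
import torch
from torch.utils.data import TensorDataset, random_split

x = torch.from_numpy(np.load("file_train_features.npy")).float() / 255.0
y = torch.from_numpy(np.load("file_train_classes.npy")).long()

full_ds = TensorDataset(x, y)
n_val = int(0.3 * len(full_ds))
train_ds, val_ds = random_split(
    full_ds, [len(full_ds) - n_val, n_val],
    generator=torch.Generator().manual_seed(42),  # fixed seed, as in the Keras split
)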