How to split a dataset into train and test sets by number of clients for "Federated learning"
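I tried to split the dataset into two parts with np.array_split, but it did not work; I get the error below.
Any advice on this problem would be appreciated.
`x` (images tensor) and `y` (labels) should have the same length. Found: x.shape = (14218, 32, 32, 3), y.shape = (2, 7109, 10)
Code: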
y_train = utils.to_categorical(y_train_data, number_of_classes) # one-hot encoding
y_test = utils.to_categorical(y_test_data, number_of_classes) # one-hot encoding
# Inspect one label sample
print('Corresponding class is 7\n', y_train[1])
'''clients_num = 2
X_train = np.array_split(X_train, clients_num)
y_train = np.array_split(y_train, clients_num)
print(np.shape(y_train))'''
input_shape = (img_rows, img_cols, 1)
rgb_batch = np.repeat(X_train_data[..., np.newaxis], 3, -1)
rgb_batch1 = np.repeat(X_test_data[..., np.newaxis], 3, -1)
X_train = tf.image.resize(rgb_batch, (32, 32))
X_test = tf.image.resize(rgb_batch1, (32, 32))
# tf.dtypes.cast returns a new tensor, so the result has to be assigned
X_train = tf.dtypes.cast(X_train, tf.float32)
X_test = tf.dtypes.cast(X_test, tf.float32)
X_train /= 255.0
X_test /= 255.0
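If I understand correctly, you have X_train and Y_train, which are NumPy arrays representing your dataset. If you want to split it into random parts, you can shuffle the dataset, then use the first part of the shuffled data for the first client and the second part for the second client: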
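# Shuffle one index array and apply it to both arrays so images and labels stay aligned
rand_indexes = np.arange(len(X_train))
np.random.shuffle(rand_indexes)
X_rand = X_train[rand_indexes]
Y_rand = Y_train[rand_indexes]
# num_samples_1: number of samples given to the first client (you choose this value)
X_1_train = X_rand[0:num_samples_1]
Y_1_train = Y_rand[0:num_samples_1]
# The remaining samples go to the second client
X_2_train = X_rand[num_samples_1:]
Y_2_train = Y_rand[num_samples_1:]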
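The same idea extends to any number of clients. The sketch below is one possible way to do it, not the only one: it shuffles a single index array, splits that with np.array_split, and keeps each client's images and labels as a separate (x, y) pair. The shapes in the error message suggest that y was split into 2 parts of 7109 while x was still the full 14218 samples; keeping per-client pairs avoids that mismatch. The helper name split_for_clients, the seed parameter, and the zero-filled demo arrays are assumptions made for illustration. If X_train is a TensorFlow tensor, as it is after tf.image.resize in the question, convert it first, for example with X_train.numpy() in eager mode.

```python
import numpy as np

def split_for_clients(x, y, clients_num, seed=0):
    """Shuffle x and y together, then split them into clients_num aligned shards."""
    x = np.asarray(x)
    y = np.asarray(y)
    assert len(x) == len(y), "x and y must have the same number of samples"
    rng = np.random.default_rng(seed)
    indexes = rng.permutation(len(x))
    # np.array_split also handles a sample count not divisible by clients_num
    return [(x[idx], y[idx]) for idx in np.array_split(indexes, clients_num)]

# Stand-in data (zeros, not real images); 101 samples to show an uneven split
X_demo = np.zeros((101, 32, 32, 3), dtype=np.float32)
y_demo = np.zeros((101, 10), dtype=np.float32)

shards = split_for_clients(X_demo, y_demo, clients_num=2)
for i, (x_c, y_c) in enumerate(shards):
    # Each client gets matching lengths for images and labels
    print(i, x_c.shape, y_c.shape)
```

Each client then trains on its own pair, for example shards[0] for the first client, rather than on a single stacked array of all the splits.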