Creating a dictionary of large-image file names in the test dataloader and assigning the predictions of all its 512x512 patches as a list of values
I am not sure why building the dictionary the following way does not create the desired output. Instead of ending up with a dictionary keyed by 88 large-image file names, I get a dictionary with only 2.

A quick intro to my test set: I have large images that are tiled into 512x512 patches. Below you can see the number of large images and 512x512 patches for each of the positive and negative labels:
--test
---pos_label: 14 large images, 11051 patches
---neg_label: 74 large images, 45230 patches
sample_fnames_labels = dataloaders_dict['test'].dataset.samples
test_large_images = {}
test_loss = 0.0
test_acc = 0

with torch.no_grad():
    test_running_loss = 0.0
    test_running_corrects = 0
    print(len(dataloaders_dict['test']))
    for i, (inputs, labels) in enumerate(dataloaders_dict['test']):
        test_inputs = inputs.to(device)
        test_labels = labels.to(device)
        test_outputs = saved_model_ft(test_inputs)
        _, test_preds = torch.max(test_outputs, 1)
        max_bs = len(test_preds)
        for j in range(max_bs):
            sample_file_name = sample_fnames_labels[i+j][0]
            patch_name = sample_file_name.split('/')[-1]
            large_image_name = patch_name.split('_')[0]
            if large_image_name not in test_large_images.keys():
                test_large_images[large_image_name] = list()
                test_large_images[large_image_name].append(test_preds[j].item())
            else:
                test_large_images[large_image_name].append(test_preds[j].item())
        #test_running_loss += test_loss.item() * test_inputs.size(0)
        test_running_corrects += torch.sum(test_preds == test_labels.data)

#test_loss = test_running_loss / len(dataloaders_dict['test'].dataset)
test_acc = test_running_corrects / len(dataloaders_dict['test'].dataset)
Here the test_large_images dictionary ends up with only two large images as keys, rather than the 88 large test images. Thanks for looking.

Essentially, I want to collect all the patch-level labels of each large image into a list keyed by large_image_filename, so that I can do majority voting later.

Here is the dataloader, used in PyTorch with a batch size of 512:
# Create training, validation, and test datasets
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val', 'test']}
# Create training, validation, and test dataloaders
print('batch size: ', batch_size)
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True, num_workers=4) for x in ['train', 'val', 'test']}
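For reference, `i` from `enumerate` over a batched DataLoader counts batches, not samples, so mapping a batch element back to `dataset.samples` needs the flat position `i * batch_size + j` rather than `i + j`. A minimal sketch of that correspondence, assuming `shuffle=False` (with `shuffle=True` the loader order and the `dataset.samples` order do not line up at all; the dataset below is a toy stand-in for the ImageFolder):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in dataset: 10 one-element samples with values 0..9.
dataset = TensorDataset(torch.arange(10).float().unsqueeze(1))
loader = DataLoader(dataset, batch_size=4, shuffle=False)  # shuffle=False is the key assumption

for i, (batch,) in enumerate(loader):            # i is a batch index
    for j in range(batch.size(0)):               # j indexes within the batch
        flat_index = i * loader.batch_size + j   # position in the underlying dataset
        assert batch[j].item() == dataset[flat_index][0].item()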
Ultimately, I want to end up with something like this:
{large_image_1: [0, 1, 1, 0], large_image_2: [1, 1, 1, 0, 0, 0, 0, 0, 0], large_image_3: [0, 0], ...}
Note that my large images vary in size, i.e. in the number of 512x512 patches they contain.
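Once the dictionary is filled, the majority-voting step itself is straightforward; a minimal sketch using collections.Counter, with made-up dictionary contents for illustration:

from collections import Counter

# Hypothetical per-patch predictions keyed by large-image name.
test_large_images = {
    'large_image_1': [0, 1, 1, 0],
    'large_image_2': [1, 1, 1, 0, 0, 0, 0, 0, 0],
    'large_image_3': [0, 0],
}

# Majority vote per large image: the most frequent patch label wins.
# Note: on a tie, most_common(1) falls back to first-seen order, so an
# explicit tie-breaking rule may be needed.
image_level_preds = {
    name: Counter(patch_preds).most_common(1)[0][0]
    for name, patch_preds in test_large_images.items()
}
print(image_level_preds)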
I do see 87 unique large-image file names below, so I don't know why only two of them get updated in the dictionary:
fnames = set()
for i in range(len(sample_fnames_labels)):
    fname = sample_fnames_labels[i][0].split('/')[-1][:23]
    fnames.add(fname)
print(len(fnames))
87
Solved the problem by setting the batch size to 1 in the test dataloader.
# Create the test dataset
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['test']}
# Create the test dataloader
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=1, shuffle=True, num_workers=4) for x in ['test']}
test_large_images = {}
test_loss = 0.0
test_acc = 0

with torch.no_grad():
    test_running_loss = 0.0
    test_running_corrects = 0
    print(len(dataloaders_dict['test']))
    for i, (inputs, labels) in enumerate(dataloaders_dict['test']):
        print(i)
        test_input = inputs.to(device)
        test_label = labels.to(device)
        test_output = saved_model_ft(test_input)
        _, test_pred = torch.max(test_output, 1)
        sample_fname, label = dataloaders_dict['test'].dataset.samples[i]
        patch_name = sample_fname.split('/')[-1]
        large_image_name = patch_name.split('_')[0]
        if large_image_name not in test_large_images.keys():
            test_large_images[large_image_name] = list()
            test_large_images[large_image_name].append(test_pred.item())
        else:
            test_large_images[large_image_name].append(test_pred.item())
        #print('test_large_images.keys(): ', test_large_images.keys())
        test_running_corrects += torch.sum(test_pred == test_label.data)

test_acc = test_running_corrects / len(dataloaders_dict['test'].dataset)
print(test_acc)
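As an alternative to batch size 1 (a sketch, not the fix used above): a small ImageFolder subclass can return each sample's file path along with the image, which keeps the filename-to-prediction mapping correct at any batch size and even with shuffle=True. The class name ImageFolderWithPaths is hypothetical:

import torchvision.datasets as datasets

class ImageFolderWithPaths(datasets.ImageFolder):
    """ImageFolder variant that also yields each sample's file path (hypothetical helper)."""
    def __getitem__(self, index):
        image, label = super().__getitem__(index)
        path, _ = self.samples[index]
        return image, label, path

# Usage sketch: paths arrive with each batch, so no index bookkeeping is needed.
# for inputs, labels, paths in test_loader:
#     ...run the model to get test_preds...
#     for path, pred in zip(paths, test_preds):
#         large_image_name = path.split('/')[-1].split('_')[0]
#         test_large_images.setdefault(large_image_name, []).append(pred.item())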