Exclude certain classes when loading a dataset with FiftyOne

Question

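I am trying to get a bunch of images from Open Images to train an object detection classifier. I found that probably the easiest way to get images from Open Images is the Python package FiftyOne. With FiftyOne, I can download images belonging to a specific class by specifying that class in the command.

My question now is: how can I exclude certain classes?

I want to train a classifier that recognizes vehicle license plates. For the training process I need positive and negative example images.
Since I want to recognize license plates rather than vehicles, I want negative examples that contain vehicles.

My idea is to take the negative examples from the class "Car", but they should not also belong to the class "Vehicle registration plate".

Is there a way to tell FiftyOne's load command that it should not include images with the class "Vehicle registration plate"?

The command I am currently using is: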
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("open-images-v6", split="train", classes="Car", max_samples=10000)
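However, this also downloads images that belong to the class "Vehicle registration plate", which I don't want.

I don't want to use FiftyOne for anything other than getting the training data.

Even though it shouldn't have anything to do with this question:
I will be using OpenCV to train and use the classifier.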

Answer 1

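After downloading the car images, you can use the filtering capabilities of FiftyOne to separate out the positive and negative examples for your task. There is no way to specifically exclude classes when downloading a dataset from the FiftyOne Zoo.

Open Images provides sample-level positive and negative labels that indicate whether a class definitely is or definitely is not present in a sample. However, these labels are not exhaustive, so if a class appears in neither of a sample's sample-level annotations, there is no way to know whether it is present.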
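To make the consequence of non-exhaustive labels concrete, here is a minimal pure-Python sketch of the three-way split they imply: verified present, verified absent, and unknown. The `ImageLabels` type, the `partition` helper, and the toy image IDs are hypothetical and only illustrate the idea; they are not part of FiftyOne or Open Images.

```python
from typing import Iterable, List, NamedTuple, Tuple


class ImageLabels(NamedTuple):
    """Hypothetical stand-in for one image's sample-level labels."""
    image_id: str
    positive: frozenset  # classes verified to be present
    negative: frozenset  # classes verified to be absent


def partition(
    samples: Iterable[ImageLabels], class_name: str
) -> Tuple[List[str], List[str], List[str]]:
    """Split image IDs into (present, absent, unknown) for one class."""
    present, absent, unknown = [], [], []
    for sample in samples:
        if class_name in sample.positive:
            present.append(sample.image_id)
        elif class_name in sample.negative:
            absent.append(sample.image_id)
        else:
            # Not annotated either way: the class may or may not be visible
            unknown.append(sample.image_id)
    return present, absent, unknown


PLATE = "Vehicle registration plate"
toy_samples = [
    ImageLabels("img_a", frozenset({"Car", PLATE}), frozenset()),
    ImageLabels("img_b", frozenset({"Car"}), frozenset({PLATE})),
    ImageLabels("img_c", frozenset({"Car"}), frozenset()),
]
present, absent, unknown = partition(toy_samples, PLATE)
```

Only the `absent` group is safe to use as negatives without further checking; the `unknown` group is the one that needs manual review.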

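Because of that, there are a couple of ways to get all the relevant samples for your task.

1) Use only the samples that are annotated for "Vehicle registration plate"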

from fiftyone import ViewField as F

class_name = "Vehicle registration plate"

# Find samples that have a "Vehicle registration plate"
pos_view = dataset.filter_labels("positive_labels", F("label")==class_name)

# Find samples explicitly labeled as NOT containing a "Vehicle registration plate"
neg_view = dataset.filter_labels("negative_labels", F("label")==class_name)

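This is the fastest way to get samples for which you can be certain whether or not a plate is present. However, you will be throwing away every sample that has no plate annotation either way.

2) Manually filter the unlabeled samples

If you need as much data as possible, you can manually go through the samples that have no plate annotation and find more negative examples.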

from fiftyone import ViewField as F

class_name = "Vehicle registration plate"

# Find samples that have a "Vehicle registration plate"
pos_view = dataset.filter_labels("positive_labels", F("label")==class_name)

# Find all samples without a positively labeled "Vehicle registration plate"
neg_view = dataset.exclude(pos_view)

From here, launch the FiftyOne App and tag all of the samples that show a plate.

import fiftyone as fo

# Launch the App on the candidate negatives; tag any sample that shows a plate with "remove"
session = fo.launch_app(view=neg_view)

# After tagging in the App, keep only the samples that were not tagged "remove"
neg_view = neg_view.match_tags("remove", bool=False)

You can then export the data to disk in a variety of formats to train your model. If the format you need isn't listed, you can simply iterate over your dataset and save the data manually.

neg_view.export(
    export_dir="/path/to/dir",
    dataset_type=fo.types.COCODetectionDataset,
    label_field="detections",
)
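If none of the built-in formats fits, the manual route can be quite short. Since the question mentions training with OpenCV, the sketch below writes a plain text file with one negative image path per line. This assumes the trainer expects such a background list, as OpenCV's cascade training tools do. The `write_negatives_file` helper and the `bg.txt` file name are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def write_negatives_file(filepaths: Iterable[str], out_file: str) -> int:
    """Write one image path per line and return the number of paths written."""
    paths = [str(p) for p in filepaths]
    out = Path(out_file)
    # Create the target directory if it does not exist yet
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("".join(p + "\n" for p in paths), encoding="utf-8")
    return len(paths)


# With FiftyOne, the paths could be taken from the filtered view, e.g.:
#   write_negatives_file(neg_view.values("filepath"), "/path/to/dir/bg.txt")
```

The function takes plain paths rather than a FiftyOne view, so the same code works for image lists from any source.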

Once your model is trained, I'd recommend using FiftyOne to visualize and analyze your predictions, so that you can understand how the model performs and improve it.
