How to upload duplicate tags on a single image in Azure Custom Vision?
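I have a question about Azure Custom Vision. I have a Custom Vision project for object detection.
I use the Python SDK to create the project (see: https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/python-tutorial-od).
However, I noticed something wrong during the upload.
For example, take a picture with 3 people in it. I tagged the same class "person" 3 times on that picture. After uploading, though, the Custom Vision website shows only 1 "person" tag on the picture.
Different classes work fine: the same picture can carry "person", "car", and "scooter". It looks as if a picture can hold only one instance of each class.
I tried uploading my images and tag information with the Python SDK (see: https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/python-tutorial-od).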
A0_tag = trainer.create_tag(project.id, "A0")
A1_tag = trainer.create_tag(project.id, "A1")
A2_tag = trainer.create_tag(project.id, "A2")
A0_image_regions = {
    "0001.jpg": [0.432291667,0.28125,0.080729167,0.09765625],
    "0001.jpg": [0.34765625,0.385742188,0.131510417,0.135742188],
    "0001.jpg": [0.479166667,0.385742188,0.130208333,0.135742188],
    "0003.jpg": [0.19921875,0.158203125,0.083333333,0.099609375]
}
The code above shows that I uploaded three "A0" regions for 0001.jpg. But in the GUI on the website, 0001.jpg ends up with only one "A0" region. Is there any way to solve this?
Based on cthrash's code, I made some changes to get it working.
Here is the modified code:
# trainer and project are created as in the linked tutorial.
from azure.cognitiveservices.vision.customvision.training.models import ImageFileCreateEntry, Region

A0_tag = trainer.create_tag(project.id, "TestA")
A1_tag = trainer.create_tag(project.id, "TestB")
A2_tag = trainer.create_tag(project.id, "TestC")

# Each dict maps a tag ID to a list of (filename, [left, top, width, height]) pairs.
A0_image_regions = {
    A0_tag.id : [
        ("2300.png",[0.787109375,0.079681275,0.068359375,0.876494024]),
        ("0920.png",[0.2109375,0.065737052,0.059570313,0.892430279]),
        ("0920.png",[0.291015625,0.061752988,0.05859375,0.894422311]),
    ]
}
A1_image_regions = {
    A1_tag.id : [
        ("2000.png",[0.067382813,0.073705179,0.030273438,0.878486056]),
        ("2000.png",[0.126953125,0.075697211,0.030273438,0.878486056]),
        ("2000.png",[0.184570313,0.079681275,0.030273438,0.878486056]),
        ("2000.png",[0.232421875,0.079681275,0.030273438,0.878486056]),
    ],
}
A2_image_regions = {
    A2_tag.id : [
        ("1400.png",[0.649414063,0.065737052,0.104492188,0.894422311]),
        ("2300.png",[0.602539063,0.061752988,0.106445313,0.892430279]),
        ("0920.png",[0.634765625,0.067729084,0.124023438,0.88247012]),
        ("0800.png",[0.579101563,0.06374502,0.04296875,0.888446215]),
    ],
}

# Collect every region per file name, across all three tags.
regions_map = {}
for tag_id in A0_image_regions:
    for filename,[x,y,w,h] in A0_image_regions[tag_id]:
        regions = regions_map.get(filename,[])
        regions.append(Region(tag_id=A0_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions
for tag_id in A1_image_regions:
    for filename,[x,y,w,h] in A1_image_regions[tag_id]:
        regions = regions_map.get(filename,[])
        regions.append(Region(tag_id=A1_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions
for tag_id in A2_image_regions:
    for filename,[x,y,w,h] in A2_image_regions[tag_id]:
        regions = regions_map.get(filename,[])
        regions.append(Region(tag_id=A2_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions

# One ImageFileCreateEntry per file, carrying all of that file's regions.
tagged_images_with_regions = []
for filename in regions_map:
    regions = regions_map[filename]
    with open("<your path>" + filename, mode="rb") as image_contents:
        tagged_images_with_regions.append(ImageFileCreateEntry(name=filename, contents=image_contents.read(), regions=regions))

upload_result = trainer.create_images_from_files(project.id, images=tagged_images_with_regions)
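Before uploading, it can help to check how many regions each file is about to carry. The following is only a sketch of such a check, not part of the SDK: it copies the file names and boxes from the code above, uses the plain tag names as stand-ins for the real tag IDs, and the names boxes_by_tag and regions_per_file are made up for illustration.

```python
from collections import Counter

# Box data copied from the code above; the tag names stand in for tag IDs
# so the check runs without calling the service.
boxes_by_tag = {
    "TestA": [
        ("2300.png", [0.787109375, 0.079681275, 0.068359375, 0.876494024]),
        ("0920.png", [0.2109375, 0.065737052, 0.059570313, 0.892430279]),
        ("0920.png", [0.291015625, 0.061752988, 0.05859375, 0.894422311]),
    ],
    "TestB": [
        ("2000.png", [0.067382813, 0.073705179, 0.030273438, 0.878486056]),
        ("2000.png", [0.126953125, 0.075697211, 0.030273438, 0.878486056]),
        ("2000.png", [0.184570313, 0.079681275, 0.030273438, 0.878486056]),
        ("2000.png", [0.232421875, 0.079681275, 0.030273438, 0.878486056]),
    ],
    "TestC": [
        ("1400.png", [0.649414063, 0.065737052, 0.104492188, 0.894422311]),
        ("2300.png", [0.602539063, 0.061752988, 0.106445313, 0.892430279]),
        ("0920.png", [0.634765625, 0.067729084, 0.124023438, 0.88247012]),
        ("0800.png", [0.579101563, 0.06374502, 0.04296875, 0.888446215]),
    ],
}

# Count the boxes per file name across all tags.
regions_per_file = Counter(
    filename for boxes in boxes_by_tag.values() for filename, _box in boxes
)

for filename, count in sorted(regions_per_file.items()):
    print(filename, count)
```

If a file shows fewer regions here than were labelled, the data structure is dropping boxes before the upload ever happens.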
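It sounds like you just want to apply the single tag person to the 3 people in a picture, but that is not meaningful, and it is not a problem. Actually, the tag is applied to the picture, not to the pixel area where a person appears in the picture.
So the tag person only helps the trained model detect that there is at least one person in the picture, as opposed to a car or a scooter. If you want to detect the different people, you need to add three tags, person1, person2 and person3, for the three different people in the picture.
Please refer to the wiki page Object detection and its reference links for more details about the principles of machine learning and deep learning.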
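If you did not change anything else in the sample code, it tries to upload the image "0001.jpg" three times with one bounding box each, and the last two uploads fail because they are duplicates of the first image you uploaded.
Either upload "0001.jpg" only once with all three bounding boxes, or upload the image first and then upload the three bounding boxes.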
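One way to see whether images were rejected as duplicates is to look at the per-image status in the upload result, as the linked tutorial does in its error-reporting loop. The sketch below assumes the result exposes an images list whose items have a status attribute. The helper name count_statuses is made up, the fake result is a stand-in so the snippet runs without a service call, and the status strings are illustrative assumptions rather than a documented list.

```python
from collections import Counter
from types import SimpleNamespace


def count_statuses(upload_result):
    """Count how many images ended up in each status."""
    return Counter(image.status for image in upload_result.images)


# Stand-in for a real upload result (assumption: status strings like these).
fake_result = SimpleNamespace(
    images=[
        SimpleNamespace(status="OK"),
        SimpleNamespace(status="OKDuplicate"),
        SimpleNamespace(status="OKDuplicate"),
    ]
)

status_counts = count_statuses(fake_result)
for status, count in status_counts.items():
    print(status, count)
```

With a real result, passing upload_result to count_statuses should show at a glance how many entries were treated as duplicates.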
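You've created A0_image_regions as a dict keyed by file name, but the key gets overwritten whenever you have more than one bounding box for a given image. So that won't work.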
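The overwriting is ordinary Python dict behaviour and can be reproduced without the SDK. In this small sketch the box values are copied from the question, and a0_boxes is a made-up name for an alternative layout that keeps every box.

```python
# The same file name is used as a key three times, as in the question.
a0_image_regions = {
    "0001.jpg": [0.432291667, 0.28125, 0.080729167, 0.09765625],
    "0001.jpg": [0.34765625, 0.385742188, 0.131510417, 0.135742188],
    "0001.jpg": [0.479166667, 0.385742188, 0.130208333, 0.135742188],
    "0003.jpg": [0.19921875, 0.158203125, 0.083333333, 0.099609375],
}

# Only two keys survive; "0001.jpg" holds the last box that was assigned.
print(len(a0_image_regions))
print(a0_image_regions["0001.jpg"])

# A list of (filename, box) pairs keeps all four boxes.
a0_boxes = [
    ("0001.jpg", [0.432291667, 0.28125, 0.080729167, 0.09765625]),
    ("0001.jpg", [0.34765625, 0.385742188, 0.131510417, 0.135742188]),
    ("0001.jpg", [0.479166667, 0.385742188, 0.130208333, 0.135742188]),
    ("0003.jpg", [0.19921875, 0.158203125, 0.083333333, 0.099609375]),
]
print(len(a0_boxes))
```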
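But perhaps more importantly, you need to call the trainer with the image as the primary object, with all the relevant image regions collected together. In other words, in your example 0001.jpg has three instances of A0, but it might also have instances of A1 and/or A2, and all of these would have to go into a single ImageFile entry. So I would modify the sample along the following lines: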
A0_tag = trainer.create_tag(project.id, "A0")
A1_tag = trainer.create_tag(project.id, "A1")
A2_tag = trainer.create_tag(project.id, "A2")

# Map each tag ID to a list of (filename, [left, top, width, height]) pairs.
image_regions = {
    A0_tag.id : [
        ("0001.jpg", [0.432291667,0.28125,0.080729167,0.09765625]),
        ("0001.jpg", [0.34765625,0.385742188,0.131510417,0.135742188]),
        ("0001.jpg", [0.479166667,0.385742188,0.130208333,0.135742188]),
        ("0003.jpg", [0.19921875,0.158203125,0.083333333,0.099609375])
    ],
    A1_tag.id : [], # add images/bounding boxes for A1
    A2_tag.id : []  # add images/bounding boxes for A2
}

# Group the regions by file name so each image is uploaded once with all its boxes.
regions_map = {}
for tag_id in image_regions:
    for filename,[x,y,w,h] in image_regions[tag_id]:
        regions = regions_map.get(filename,[])
        regions.append(Region(tag_id=tag_id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions

tagged_images_with_regions = []
for filename in regions_map:
    regions = regions_map[filename]
    with open(base_image_url + filename, mode="rb") as image_contents:
        tagged_images_with_regions.append(ImageFileCreateEntry(name=filename, contents=image_contents.read(), regions=regions))

upload_result = trainer.create_images_from_files(project.id, images=tagged_images_with_regions)