Microsoft Cognitive Services Computer Vision API "Recognize Domain Specific Content" feature

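I'm trying to use the "Recognize Domain Specific Content" feature of the Microsoft Cognitive Services Computer Vision API, but I'm running into some difficulty.

No matter how I submit the photos (I even tried cropping them down to just the face by first getting a thumbnail from the Computer Vision API), I never get any celebrities back. :-( I've tried submitting the image both by uploading it (from a small Java testlet I wrote) and by specifying an image URL. Neither works.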

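However, when I use the same photos on http://www.celebslike.me, the results do show celebrities.

(I even used some of the sample photos taken from http://www.celebslike.me itself. The site shows results for them, but I get nothing when I call the API manually.)

I always get a result like this: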

{
  "requestId": "278d8ed0-79dc-4817-8329-b8440c650f9b",
  "metadata": {
    "width": 250,
    "height": 250,
    "format": "Jpeg"
  },
  "result": {
    "celebrities": []
  }
}

Note the "celebrities": [] part: the list is empty, even though the same photo returns some celebrities on http://www.celebslike.me.

So, is there a step I'm missing? Do I need to "pre-process" the photos first?

This page, [https://www.microsoft.com/cognitive-services/en-us/computer-vision-api/documentation#Domain-Specific], says:

Option One - Scoped Analysis

Analyze only a chosen model, by invoking an HTTP POST call. For this option, if you know which model you want to use, you just specify the model’s name, and you only get information relevant to that model. For example, you can use this option to only look for celebrity-recognition; the response will contain a list of potential matching celebrities, accompanied by their confidence scores.

Option Two - Enhanced Analysis

Analyze to provide additional details related to categories from one of the 86-category taxonomy. This option is available for use in applications where users want to get generic image analysis in addition to details from one or more domain-specific models. When this method is invoked, the 86-category taxonomy classifier is called first. If any of the categories match that of known/matching models, a second pass of classifier invocations will follow. For example, if “details=all” or "details" include “celebrities”, the method will call the celebrity classifier after the 86-category classifier is called and the result includes “object_people_celebrities”.

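But how do I actually use it?

This may surprise you, but this is working as intended. The Cognitive Services celebrity recognizer is tuned to reduce false positives, so it does not perform well in 'like-me' type scenarios.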

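As for the two options for calling the service, they are largely the same. One (Option Two, Enhanced Analysis) is "find some properties of this image and if there are any celebrities, tell me"; the other (Option One, Scoped Analysis) is "tell me the celebrities in this image, I'm not interested in any other properties." As you'd imagine, the latter is marginally more efficient.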
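
To make the difference concrete, here is a rough Python sketch of how the two calls might be built. Treat the details as assumptions rather than the definitive API: the regional host in BASE_URL, the /models/celebrities/analyze and /analyze paths, the visualFeatures and details query parameters, the Ocp-Apim-Subscription-Key header, and the categories[].detail.celebrities shape of the enhanced response are how I understand the v1.0 REST API, so check them against the reference for your own subscription. The helper names (scoped_request, enhanced_request, celebrities_from, send) are just made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a regional v1.0 endpoint. Use the one listed for your subscription.
BASE_URL = "https://westus.api.cognitive.microsoft.com/vision/v1.0"


def _post_json(url, image_url, subscription_key):
    """Build (but do not send) a JSON POST that refers to the image by URL."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed name of the subscription key header.
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
    )


def scoped_request(image_url, subscription_key, model="celebrities"):
    """Option One (Scoped Analysis): ask only the named domain model."""
    url = f"{BASE_URL}/models/{urllib.parse.quote(model)}/analyze"
    return _post_json(url, image_url, subscription_key)


def enhanced_request(image_url, subscription_key):
    """Option Two (Enhanced Analysis): categories plus celebrity details."""
    query = urllib.parse.urlencode(
        {"visualFeatures": "Categories", "details": "Celebrities"}
    )
    return _post_json(f"{BASE_URL}/analyze?{query}", image_url, subscription_key)


def celebrities_from(response):
    """Collect celebrity entries from either response shape.

    Scoped responses carry them under result.celebrities (as in the question).
    Enhanced responses are assumed to nest them under categories[].detail.
    """
    found = list(response.get("result", {}).get("celebrities", []))
    for category in response.get("categories", []):
        detail = category.get("detail") or {}
        found.extend(detail.get("celebrities", []))
    return found


def send(request):
    """Send a prepared request and decode the JSON reply (does network I/O)."""
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

You would call something like celebrities_from(send(scoped_request(photo_url, key))), where photo_url and key are your own values. For the response shown in the question, celebrities_from returns an empty list with either call style, which is consistent with the recognizer simply not being confident enough to name anyone.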