Using the Azure Face API in Python: how to return a single faceId or a group of faceIds when the same person is detected in a video stream?

Question

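I am using the Azure Face API to detect faces in a video stream, but for each detected face Azure returns a unique faceId (which is exactly what the documentation says).

The problem is this: say Mr. ABC appears in 20 video frames, so 20 unique faceIds are generated. I want Azure Face to return a single faceId, or a group of faceIds generated specifically for Mr. ABC, so that I can tell it is the same person who stayed in front of the camera for x amount of time.

I have read the documentation for Azure Face Group and Azure Find Similar, but I don't understand how to make them work for a live video stream.

The code I am using to detect faces with Azure Face is given below: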

from azure.cognitiveservices.vision.face import FaceClient
from msrest.authentication import CognitiveServicesCredentials
from azure.cognitiveservices.vision.face.models import TrainingStatusType, Person, SnapshotObjectType, OperationStatusType
import cv2
import os
import requests
import sys,glob, uuid,re
from PIL import Image, ImageDraw
from urllib.parse import urlparse
from io import BytesIO
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient,__version__

face_key = 'XABC' #API key
face_endpoint = 'https://XENDPOINT.cognitiveservices.azure.com' #endpoint, e.g. 'https://westus.api.cognitive.microsoft.com'

credentials = CognitiveServicesCredentials(face_key)
face_client = FaceClient(face_endpoint, credentials)

camera = cv2.VideoCapture(0)
samplenum =1
im = ""
work_dir = os.getcwd()

person_group_id = 'test02-group'
target_person_group_id = str(uuid.uuid4())
face_ids = []

#cv2 font
font = cv2.FONT_HERSHEY_SIMPLEX
#empty tuple
width = ()
height = ()
left=0
bottom=0
def getRectangle(faceDictionary):
    rect = faceDictionary.face_rectangle
    left = rect.left
    top = rect.top
    right = left + rect.width
    bottom = top + rect.height
    
    return ((left, top), (right, bottom))

while True:
    check, campic = camera.read()
    if not check:  # stop if the camera did not deliver a frame
        break
    samplenum = samplenum + 1
    path = os.path.join(work_dir, "live_pics", str(samplenum) + ".jpg")
    cv2.imwrite(path, campic)
    # Send the saved frame to the Face API; the file is closed after the call
    with open(path, "rb") as stream:
        detected_faces = face_client.face.detect_with_stream(
            stream,
            return_face_id=True,
            return_face_attributes=['age', 'gender', 'emotion'],
            recognition_model="recognition_03")  # keyword is recognition_model, not recognitionModel
    for face in detected_faces:
        # getRectangle returns the top-left and bottom-right corners
        top_left, bottom_right = getRectangle(face)
        cv2.rectangle(campic, top_left, bottom_right, (0, 0, 170), 2)
        face_ids.append(face.face_id)
    if samplenum > 10:
        break
    cv2.imshow("campic", campic)
    if cv2.waitKey(1) == ord("q"):
        break

camera.release()
cv2.destroyAllWindows()

Answer 1

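There is no magic in the Face API: you have to process each face you find in 2 steps.

What I suggest is to use "Find Similar":

At the beginning, create a "FaceList".
Then process your video:
- Face detection on each frame
- For each face found, run the Find Similar operation against the face list you created. If there is no match (with sufficient confidence), add the face to the face list. If there is a match, the persistedFaceId of the matching entry is the stable identifier for that person.

At the end, your face list will contain all the different people found in the video.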
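The steps above could be sketched roughly as follows. This is only an illustration, not a tested integration: the function name `assign_persistent_ids`, the locally tracked `known` set, and the `min_confidence` threshold of 0.6 are assumptions made for the example. It also assumes the FaceList was already created (for instance with `face_client.face_list.create`) using the same recognition model as detection, and that `face_client` is a `FaceClient` like the one in the question.

```python
from io import BytesIO


def assign_persistent_ids(face_client, face_list_id, frame_bytes, known,
                          min_confidence=0.6):
    """Return one stable persisted face ID per face detected in a frame.

    face_client  -- an azure.cognitiveservices.vision.face.FaceClient
    face_list_id -- ID of an existing FaceList (assumed to be created beforehand)
    frame_bytes  -- the encoded frame (e.g. JPEG bytes)
    known        -- set of persisted face IDs added so far, tracked locally
    min_confidence is a hypothetical threshold; tune it for your footage.
    """
    detected = face_client.face.detect_with_stream(
        BytesIO(frame_bytes),
        return_face_id=True,
        recognition_model="recognition_03",
    )
    stable_ids = []
    for face in detected:
        matches = []
        if known:  # skip the lookup while the face list is still empty
            matches = face_client.face.find_similar(
                face_id=face.face_id,
                face_list_id=face_list_id,
                max_num_of_candidates_returned=1,
            )
        best = max(matches, key=lambda m: m.confidence) if matches else None
        if best is not None and best.confidence >= min_confidence:
            # Same person as an earlier frame: reuse the stored ID
            stable_ids.append(best.persisted_face_id)
            continue
        # New person: store this face so later frames can match against it
        r = face.face_rectangle
        persisted = face_client.face_list.add_face_from_stream(
            face_list_id,
            BytesIO(frame_bytes),
            target_face=[r.left, r.top, r.width, r.height],
        )
        known.add(persisted.persisted_face_id)
        stable_ids.append(persisted.persisted_face_id)
    return stable_ids
```

Calling this once per frame with the same `known` set should give Mr. ABC the same persisted face ID in every frame where the match confidence clears the threshold, regardless of how many temporary faceIds detection produced for him.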

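For your real-time use case, don't use the "Identify" operation with a PersonGroup / LargePersonGroup (the choice between the two depends on the size of the group), because you will get stuck on the need to train the group. For example, you would do the following:

Step 1, once: create the PersonGroup / LargePersonGroup for this run
Step 2, N times (for each image in which you want to identify faces):
- Step 2a: face detection
- Step 2b: "Identify" each detected face against the PersonGroup / LargePersonGroup
- Step 2c: for each unidentified face, add it to the PersonGroup / LargePersonGroup.

The problem is that after 2c you have to train the group again. Even if training doesn't take long, it is still too slow for real-time use.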
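To make the cost of step 2c concrete, here is a rough sketch of what enrolling one unidentified face and retraining might look like. The helper name `enroll_and_retrain` and the `poll_seconds` / `timeout_seconds` parameters are assumptions for illustration; the status strings compared against are the values of the SDK's `TrainingStatusType` enum. Until the polling loop finishes, Identify will not recognize the newly added person.

```python
import time


def enroll_and_retrain(face_client, person_group_id, name, frame_stream,
                       poll_seconds=1.0, timeout_seconds=60.0):
    """Add one new person to a PersonGroup and block until training is done.

    Returns (person_id, seconds_spent_training). The polling interval and
    timeout are hypothetical defaults, not recommended values.
    """
    # Step 2c: create the person and attach the face from the current frame
    person = face_client.person_group_person.create(person_group_id, name=name)
    face_client.person_group_person.add_face_from_stream(
        person_group_id, person.person_id, frame_stream)

    # The group has to be trained again before Identify can use the new face
    started = time.monotonic()
    face_client.person_group.train(person_group_id)
    while True:
        status = face_client.person_group.get_training_status(person_group_id).status
        if status == "succeeded":
            return person.person_id, time.monotonic() - started
        if status == "failed":
            raise RuntimeError("training of the person group failed")
        if time.monotonic() - started > timeout_seconds:
            raise TimeoutError("training did not finish in time")
        time.sleep(poll_seconds)
```

Every new face pays this wait, which is why the Find Similar approach, which needs no training for a FaceList, fits a live stream better.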

Answer 2

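As I understand it, you want to show a person's name/identity instead of the face ID returned by the Face API.

If so, after getting the face IDs from the Face Detect API, you should use the Face Identify API. If the Azure Face service can recognize a face, you get a person ID back, and with that ID you can call the PersonGroup Person API to get the person's information.

I also wrote a simple demo for you. It uses only one image, which we can think of as a single video frame. I created a person group containing one person, Superman, and added some faces for him.

The code is below: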

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image
import numpy as np
import asyncio
import io
import glob
import os
import sys
import time
import uuid
import requests
from urllib.parse import urlparse
from io import BytesIO
from azure.cognitiveservices.vision.face import FaceClient
from msrest.authentication import CognitiveServicesCredentials

imPath = "<image path>"
ENDPOINT = '<endpoint>'
KEY = '<key>'
PERSON_GROUP_ID = '<person group name>'

face_client = FaceClient(ENDPOINT, CognitiveServicesCredentials(KEY))
im = np.array(Image.open(imPath), dtype=np.uint8)

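# Detect faces; the recognition model should match the one used by the person group
with open(imPath, 'rb') as image_stream:
    faces = face_client.face.detect_with_stream(image_stream, recognition_model='recognition_03')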

# Create figure and axes
fig,ax = plt.subplots()

# Display the image
ax.imshow(im)

for face in faces:
    r = face.face_rectangle
    # Rectangle takes (x, y), then width, then height
    rect = patches.Rectangle((r.left, r.top), r.width, r.height, linewidth=1, edgecolor='r', facecolor='none')
    # Identify returns one result per submitted face ID
    detected_person = face_client.face.identify([face.face_id], PERSON_GROUP_ID)[0]
    if detected_person.candidates:
        person_id = detected_person.candidates[0].person_id
        person = face_client.person_group_person.get(PERSON_GROUP_ID, person_id)
        plt.text(r.left, r.top, person.name, color='r')
    else:
        plt.text(r.left, r.top, 'unknown', color='r')

    ax.add_patch(rect)

plt.show()

Result: [screenshot omitted]
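Once every detection is mapped to a stable ID (a person ID from Identify, or a persisted face ID from a FaceList), the "stayed in front of the camera for x time" part of the question becomes plain bookkeeping. A minimal sketch, assuming you record a `(timestamp, stable_id)` pair for each detection; the function name `dwell_times` and the `max_gap` parameter are hypothetical:

```python
def dwell_times(observations, max_gap=1.0):
    """Sum up how long each stable ID was seen.

    observations -- iterable of (timestamp_seconds, stable_id), sorted by time
    max_gap      -- assumed limit in seconds; a longer pause between two
                    sightings of the same ID is treated as the person having
                    left, and that pause is not counted
    """
    last_seen = {}
    totals = {}
    for timestamp, stable_id in observations:
        previous = last_seen.get(stable_id)
        totals.setdefault(stable_id, 0.0)
        if previous is not None and timestamp - previous <= max_gap:
            totals[stable_id] += timestamp - previous
        last_seen[stable_id] = timestamp
    return totals
```

In the capture loop you would append `(time.time(), stable_id)` for each recognized face and call `dwell_times` when the stream ends.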
