How to crop a face detected via Mediapipe in Python

I have a question about Mediapipe coordinates. What I want to do is crop the bounding box of the detected face.

https://google.github.io/mediapipe/solutions/face_detection.html

EXAMPLE OF PROCEDURE

I am using the code below:

import cv2
import matplotlib.pyplot as plt
import mediapipe as mp

mp_face_detection = mp.solutions.face_detection
 
# Setup the face detection function.
face_detection = mp_face_detection.FaceDetection(model_selection=0, min_detection_confidence=0.5)
 
# Initialize the mediapipe drawing class.
mp_drawing = mp.solutions.drawing_utils

# Read an image from the specified path.
sample_img = cv2.imread('12345.jpg')
 
# Specify a size of the figure.
plt.figure(figsize = [10, 10])
 
# Display the sample image, also convert BGR to RGB for display. 
plt.title("Sample Image")
plt.axis('off')
plt.imshow(sample_img[:, :, ::-1])
plt.show()

face_detection_results = face_detection.process(sample_img[:,:,::-1])
 
# Check if the face(s) in the image are found.
if face_detection_results.detections:
    
    # Iterate over the found faces.
    for face_no, face in enumerate(face_detection_results.detections):
        
        # Display the number of the face we are currently iterating over.
        print(f'FACE NUMBER: {face_no+1}')
        print('---------------------------------')
        
        # Display the face confidence.
        print(f'FACE CONFIDENCE: {round(face.score[0], 2)}')
        
        # Get the face bounding box and face key points coordinates.
        face_data = face.location_data
        
        # Display the face bounding box coordinates.
        print(f'\nFACE BOUNDING BOX:\n{face_data.relative_bounding_box}')
        
        # Iterate two times as we only want to display first two key points of each detected face.
        for i in range(2):
 
            # Display the found normalized key points.
            print(f'{mp_face_detection.FaceKeyPoint(i).name}:')
            print(f'{face_data.relative_keypoints[mp_face_detection.FaceKeyPoint(i).value]}')

The result looks like this:

FACE NUMBER: 1

FACE CONFIDENCE: 0.89

FACE BOUNDING BOX:
xmin: 0.2784463167190552
ymin: 0.3503175973892212
width: 0.1538110375404358
height: 0.23071599006652832

RIGHT_EYE:
x: 0.3447018265724182
y: 0.4222590923309326

LEFT_EYE:
x: 0.39114508032798767
y: 0.3888365626335144

I want to crop the image at the coordinates of the bounding box, like this:

face = Image.fromarray(image).crop(face_rect)

or with any other cropping method. My problem is that I cannot get the coordinates of the detected item from Mediapipe.

Any ideas?

Guys, I found the solution:

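import numpy as np
from PIL import Image

# `sample_img`, `face_data`, `cv2` and `plt` come from the detection code in the question.
h, w, c = sample_img.shape
print('width:  ', w)
print('height: ', h)

# Scale the normalized bounding box to pixel coordinates.
data = face_data.relative_bounding_box
xleft = int(data.xmin * w)
xtop = int(data.ymin * h)
xright = int(data.width * w + xleft)
xbottom = int(data.height * h + xtop)
detected_faces = [(xleft, xtop, xright, xbottom)]

# Assumption: image_c is the RGB version of the image, since Pillow expects RGB and OpenCV loads BGR.
image_c = cv2.cvtColor(sample_img, cv2.COLOR_BGR2RGB)

for n, face_rect in enumerate(detected_faces):
    face = Image.fromarray(image_c).crop(face_rect)
    face_np = np.asarray(face)
    plt.imshow(face_np)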

Assume the objective is to crop a single face detected by Mediapipe. Note that the [0] indicates we are only interested in a single face.

results = mp_face.process(image_input)
detection=results.detections[0]

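By default, Mediapipe returns detection data in normalized form, so we have to convert it back to the original size by multiplying the x values by the width of the input image and the y values by its height.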

We can use Mediapipe's _normalized_to_pixel_coordinates helper for this:

relative_bounding_box = location.relative_bounding_box
rect_start_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin, relative_bounding_box.ymin, image_cols,
    image_rows)
rect_end_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin + relative_bounding_box.width,
    relative_bounding_box.ymin + relative_bounding_box.height, image_cols,
    image_rows)

This basically yields

xleft,ytop=rect_start_point
xright,ybot=rect_end_point

In other words, ytop, ybot, xleft and xright represent face_top, face_bottom, face_left and face_right respectively.
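
One caveat: _normalized_to_pixel_coordinates is a private helper, and it returns None when a normalized value falls outside [0, 1]. That can reportedly happen when a face is partly outside the frame, in which case the tuple unpacking above would fail. Here is a minimal sketch of doing the conversion by hand and clamping to the image bounds instead. The function name bbox_to_pixels is made up for this example, and the 640x480 image size is hypothetical:

```python
def bbox_to_pixels(xmin, ymin, width, height, image_cols, image_rows):
    """Convert a normalized box to pixel corners clamped to the image.

    Returns (xleft, ytop, xright, ybot).
    """
    def clamp(value, upper):
        # Truncate to an integer pixel index and keep it within [0, upper].
        return max(0, min(int(value), upper))

    xleft = clamp(xmin * image_cols, image_cols)
    ytop = clamp(ymin * image_rows, image_rows)
    xright = clamp((xmin + width) * image_cols, image_cols)
    ybot = clamp((ymin + height) * image_rows, image_rows)
    return xleft, ytop, xright, ybot


# Rounded box values from the question's output, on a hypothetical 640x480 image.
print(bbox_to_pixels(0.2784, 0.3503, 0.1538, 0.2307, 640, 480))
```

The clamping means a box that sticks out of the image is cut at the border rather than rejected, which is usually what you want for cropping.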

Since the image is just a 3D NumPy array, we can crop it as follows:

crop_img = image_input[ytop: ybot, xleft: xright]
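
To make the indexing order concrete: rows (y) come first and columns (x) second. Below is a small sketch on a synthetic array with made-up coordinates. Slicing returns a view of the original array, so .copy() is used here to get an independent crop:

```python
import numpy as np

# Synthetic 480x640 RGB image (rows, cols, channels), all black.
image = np.zeros((480, 640, 3), dtype=np.uint8)

# Made-up pixel coordinates of a face box.
xleft, ytop, xright, ybot = 178, 168, 276, 278

# Rows are indexed by y, columns by x.
crop = image[ytop:ybot, xleft:xright].copy()
print(crop.shape)  # (110, 98, 3)
```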

The full code is as follows:

import cv2
import mediapipe as mp
from mediapipe.python.solutions.drawing_utils import _normalized_to_pixel_coordinates



# load face detection model
mp_face = mp.solutions.face_detection.FaceDetection(
    model_selection=1, # model selection
    min_detection_confidence=0.5 # confidence threshold
)
# Read as a 3-channel BGR image (flag 0 would load grayscale and break the shape unpacking below).
dframe = cv2.imread('xx.png')
image_rows, image_cols, _ = dframe.shape
image_input = cv2.cvtColor(dframe, cv2.COLOR_BGR2RGB)
results = mp_face.process(image_input)
detection=results.detections[0]
location = detection.location_data

relative_bounding_box = location.relative_bounding_box
rect_start_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin, relative_bounding_box.ymin, image_cols,
    image_rows)
rect_end_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin + relative_bounding_box.width,
    relative_bounding_box.ymin + relative_bounding_box.height, image_cols,
    image_rows)


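# Crop first so the box drawn below does not end up in the cropped image.
xleft, ytop = rect_start_point
xright, ybot = rect_end_point
crop_img = image_input[ytop:ybot, xleft:xright].copy()

# Let's draw a bounding box on the full image.
color = (255, 0, 0)
thickness = 2
cv2.rectangle(image_input, rect_start_point, rect_end_point, color, thickness)

# image_input is RGB, but cv2.imwrite expects BGR, so convert back before saving.
cv2.imwrite('crop_image0.jpg', cv2.cvtColor(crop_img, cv2.COLOR_RGB2BGR))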
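
The code above only handles the first detection and assumes a face was found. If you need every face, a sketch along these lines might help. The function name crop_faces is made up, and the SimpleNamespace object is only a stand-in that mimics the attribute layout of a Mediapipe detection so the example runs without a model:

```python
from types import SimpleNamespace

import numpy as np


def crop_faces(image, detections):
    """Return one cropped array per detection; empty list if there are none."""
    rows, cols = image.shape[:2]
    crops = []
    # results.detections is None when no face is found, hence the `or []`.
    for detection in detections or []:
        box = detection.location_data.relative_bounding_box
        # Scale to pixels and clamp to the image bounds.
        xleft = max(0, int(box.xmin * cols))
        ytop = max(0, int(box.ymin * rows))
        xright = min(cols, int((box.xmin + box.width) * cols))
        ybot = min(rows, int((box.ymin + box.height) * rows))
        # Skip degenerate boxes that end up empty after clamping.
        if xright > xleft and ybot > ytop:
            crops.append(image[ytop:ybot, xleft:xright].copy())
    return crops


# Stand-in for a Mediapipe detection (hypothetical values).
fake = SimpleNamespace(location_data=SimpleNamespace(
    relative_bounding_box=SimpleNamespace(xmin=0.25, ymin=0.5, width=0.5, height=0.25)))
image = np.zeros((400, 800, 3), dtype=np.uint8)
print([c.shape for c in crop_faces(image, [fake])])
print(crop_faces(image, None))
```

With the real model you would call it as crop_faces(image_input, results.detections).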