How does Elevation of a Head Pose in Python-OpenCV work?
I am trying to estimate the head pose of a single image, mostly following this guide:
https://towardsdatascience.com/real-time-head-pose-estimation-in-python-e52db1bc606a
The face detection works fine - if I plot the image and the detected landmarks, they line up nicely.
I am estimating the camera matrix from the image, assuming no lens distortion:
import cv2
import numpy as np

size = image.shape                      # (height, width, channels)
focal_length = size[1]                  # approximate focal length by the image width
center = (size[1] / 2, size[0] / 2)     # principal point at the image center
camera_matrix = np.array([[focal_length, 0, center[0]],
                          [0, focal_length, center[1]],
                          [0, 0, 1]], dtype="double")
dist_coeffs = np.zeros((4, 1))          # assuming no lens distortion
I am trying to get the head pose by matching points in the image to points of a 3D model, using solvePnP:
# 3D model points to which the points extracted from an image are matched:
model_points = np.array([
    (0.0, 0.0, 0.0),           # Nose tip
    (0.0, -330.0, -65.0),      # Chin
    (-225.0, 170.0, -135.0),   # Left eye left corner
    (225.0, 170.0, -135.0),    # Right eye right corner
    (-150.0, -150.0, -125.0),  # Left mouth corner
    (150.0, -150.0, -125.0)    # Right mouth corner
])
image_points = np.array([
    shape[30],  # Nose tip
    shape[8],   # Chin
    shape[36],  # Left eye left corner
    shape[45],  # Right eye right corner
    shape[48],  # Left mouth corner
    shape[54]   # Right mouth corner
], dtype="double")
success, rotation_vec, translation_vec = \
    cv2.solvePnP(model_points, image_points, camera_matrix, dist_coeffs)
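To sanity-check the recovered pose, one can project a 3D point in front of the nose tip back into the image and draw the resulting line, as the linked guide does. A minimal sketch, assuming image is the BGR frame and the variables above are in scope (the 1000-unit distance is an arbitrary choice):

# Project a point 1000 model units in front of the nose tip into the image plane.
nose_end_3d = np.array([(0.0, 0.0, 1000.0)])
nose_end_2d, _ = cv2.projectPoints(nose_end_3d, rotation_vec,
                                   translation_vec, camera_matrix, dist_coeffs)

p1 = (int(image_points[0][0]), int(image_points[0][1]))      # nose tip in the image
p2 = (int(nose_end_2d[0][0][0]), int(nose_end_2d[0][0][1]))  # projected end point
cv2.line(image, p1, p2, (255, 0, 0), 2)  # the line should follow the viewing direction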
Finally, I get the Euler angles of the rotation:
rotation_mat, _ = cv2.Rodrigues(rotation_vec)            # rotation vector -> 3x3 matrix
pose_mat = cv2.hconcat((rotation_mat, translation_vec))  # 3x4 [R|t] pose matrix
_, _, _, _, _, _, angles = cv2.decomposeProjectionMatrix(pose_mat)  # Euler angles in degrees
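As a cross-check (my addition, not part of the original code), the same angles can be read directly from the rotation matrix with the standard ZYX (yaw-pitch-roll) decomposition; comparing the two outputs helps tell an angle-convention problem from a bad pose. A minimal sketch:

import math

def euler_from_rotation(R):
    # ZYX Euler angles (degrees) from a 3x3 rotation matrix.
    pitch = math.degrees(math.atan2(-R[2][0], math.hypot(R[2][1], R[2][2])))
    yaw = math.degrees(math.atan2(R[1][0], R[0][0]))
    roll = math.degrees(math.atan2(R[2][1], R[2][2]))
    return pitch, yaw, roll

pitch, yaw, roll = euler_from_rotation(rotation_mat)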
Now the azimuth is what I would expect - it is negative if I look to the left, zero in the middle, and positive if I look to the right.
But the elevation is weird - if I look towards the middle it has a roughly constant value, but the sign is random, changing from image to image (the value is around 170). When I look up the sign is positive and the value gets smaller the further up I look; when I look down the sign is negative and the value likewise gets lower the further down I look.
Can someone explain this output to me?
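For what it's worth, the behaviour above is consistent with a pitch that is reported near ±180° instead of near 0° (a guess on my part, not something the post confirms): cv2.decomposeProjectionMatrix returns angles in (-180°, 180°], so a nearly level head can land just below +180° or just above -180°, with the sign flipping from image to image. A minimal sketch that unwraps such a reading into a small signed elevation:

import math

def unwrap_pitch(angle_deg):
    # Map readings near +/-180 deg (e.g. +170 or -170) to small signed
    # pitches (+10 or -10) while leaving readings near 0 unchanged.
    return math.degrees(math.asin(math.sin(math.radians(angle_deg))))

print(unwrap_pitch(170.0))   # ->  10.0 (slightly up)
print(unwrap_pitch(-170.0))  # -> -10.0 (slightly down)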
OK, it seems I found a solution - the model points (which I had found in several blogs on the topic) seem to be wrong. The code appears to work with this combination of model and image points (I don't know why - it was trial and error):
model_points = np.float32([[6.825897, 6.760612, 4.402142],    # left eyebrow, left corner
                           [1.330353, 7.122144, 6.903745],    # left eyebrow, right corner
                           [-1.330353, 7.122144, 6.903745],   # right eyebrow, left corner
                           [-6.825897, 6.760612, 4.402142],   # right eyebrow, right corner
                           [5.311432, 5.485328, 3.987654],    # left eye, left corner
                           [1.789930, 5.393625, 4.413414],    # left eye, right corner
                           [-1.789930, 5.393625, 4.413414],   # right eye, left corner
                           [-5.311432, 5.485328, 3.987654],   # right eye, right corner
                           [2.005628, 1.409845, 6.165652],    # left side of nose
                           [-2.005628, 1.409845, 6.165652],   # right side of nose
                           [2.774015, -2.080775, 5.048531],   # left mouth corner
                           [-2.774015, -2.080775, 5.048531],  # right mouth corner
                           [0.000000, -3.116408, 6.097667],   # bottom of mouth
                           [0.000000, -7.415691, 4.070434]])  # chin
image_points = np.float32([shape[17], shape[21], shape[22], shape[26],  # eyebrow corners
                           shape[36], shape[39], shape[42], shape[45],  # eye corners
                           shape[31], shape[35], shape[48], shape[54],  # nose sides, mouth corners
                           shape[57], shape[8]])                        # bottom lip, chin
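With these correspondences the rest of the pipeline stays the same. A minimal end-to-end sketch, assuming shape holds the 68 dlib landmarks as (x, y) pairs and camera_matrix / dist_coeffs come from the snippet above:

success, rotation_vec, translation_vec = cv2.solvePnP(
    model_points, image_points, camera_matrix, dist_coeffs)

rotation_mat, _ = cv2.Rodrigues(rotation_vec)
pose_mat = cv2.hconcat((rotation_mat, translation_vec))
_, _, _, _, _, _, angles = cv2.decomposeProjectionMatrix(pose_mat)

pitch, yaw, roll = angles.flatten()[:3]
print(f"pitch={pitch:.1f}, yaw={yaw:.1f}, roll={roll:.1f}")  # degrees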