Recreating Global 3D Points from Local 3D Points and Global 2D Points; solvePnP
Hello,
I have a list of 2D keypoints in a global scope/frame (image points) and a corresponding list of 3D keypoints in a local scope (commonly called texture or object points). The image points range over x [0, 1920] and y [0, 1080]; the object points range over x [-1, 1] and y [-1, 1]. I followed the method described in this paper on page 6 together with the tutorial from here, but the output of my 3D points is not correct at all; the points jump all over the place. Below is how I use solvePnP. Am I on the wrong track here, since solvePnP is usually used to detect camera movement (other suggestions welcome!), or is there a mistake in my approach?
import numpy as np
import cv2
array = np.array # convenience
frame1_2d = \
array([[1033.9708251953125 , 344.23065185546875],
[1077.796630859375 , 617.1146240234375 ],
[ 958.2716674804688 , 609.1179809570312 ],
[1074.8084716796875 , 782.0444946289062 ],
[ 975.2044067382812 , 418.1991882324219 ],
[1024.0103759765625 , 931.980712890625 ],
[1122.6185302734375 , 605.1196899414062 ],
[1096.721435546875 , 418.1991882324219 ],
[ 999.109375 , 617.1146240234375 ],
[ 962.255859375 , 518.1566772460938 ],
[1111.662109375 , 517.1571044921875 ],
[1014.0499877929688 , 782.0444946289062 ],
[1061.8599853515625 , 930.9811401367188 ]])
frame1_3d = \
array([[-0.01265097688883543 , -0.4992150068283081 , -0.11455678939819336 ],
[ 0.10584918409585953 , -0.0018199272453784943 , 0.0023642126470804214 ],
[-0.14271944761276245 , 0.06332945823669434 , 0.1438678503036499 ],
[ 0.09254898130893707 , 0.3176574409008026 , -0.17930322885513306 ],
[-0.1155640035867691 , -0.4058316648006439 , 0.00021289288997650146],
[-0.03301446512341499 , 0.6519031524658203 , -0.3515356183052063 ],
[ 0.14540529251098633 , 0.05645819008350372 , 0.10776595026254654 ],
[ 0.10836226493120193 , -0.4078497290611267 , 0.000870194286108017 ],
[-0.10584865510463715 , 0.001818838994950056 , -0.0023612845689058304 ],
[-0.1546039581298828 , -0.17418316006660461 , 0.10266228020191193 ],
[ 0.1590884029865265 , -0.17913128435611725 , 0.09423552453517914 ],
[-0.0736076831817627 , 0.3179360628128052 , -0.17892584204673767 ],
[ 0.05236409604549408 , 0.6490492820739746 , -0.33908188343048096 ]])
frame2_2d = \
array([[1028.110107421875 , 327.7352600097656 ],
[1068.0904541015625 , 606.7128295898438 ],
[ 982.1328125 , 229.74314880371094],
[1071.0889892578125 , 778.698974609375 ],
[ 979.13427734375 , 403.7291564941406 ],
[1013.1174926757812 , 933.6865234375 ],
[1069.0899658203125 , 243.7420196533203 ],
[1080.08447265625 , 403.7291564941406 ],
[ 997.1254272460938 , 616.7119750976562 ],
[ 983.13232421875 , 312.7364501953125 ],
[1071.0889892578125 , 317.7360534667969 ],
[1005.1214599609375 , 778.698974609375 ],
[1061.0938720703125 , 936.686279296875 ]])
frame2_3d = \
array([[-0.0004756036214530468, -0.5245562791824341 , -0.010652128607034683 ],
[ 0.10553547739982605 , -0.00272204983048141 , 0.0024587283842265606],
[-0.1196068525314331 , -0.6828885078430176 , -0.14210689067840576 ],
[ 0.0845363438129425 , 0.38039350509643555 , -0.028144780546426773 ],
[-0.11286421865224838 , -0.4302292466163635 , 0.06919233500957489 ],
[-0.030065223574638367 , 0.754790186882019 , 0.012936152517795563 ],
[ 0.1010960042476654 , -0.6289429664611816 , -0.11814753711223602 ],
[ 0.1058841198682785 , -0.4253752827644348 , 0.08086629956960678 ],
[-0.10553570091724396 , 0.002716599963605404 , -0.0024500866420567036],
[-0.127223938703537 , -0.5319695472717285 , -0.09722068160772324 ],
[ 0.11508879065513611 , -0.49151480197906494 , -0.07002018392086029 ],
[-0.06679684668779373 , 0.38714516162872314 , -0.023669833317399025 ],
[ 0.05081187188625336 , 0.7544023990631104 , -0.011078894138336182 ]])
frame3_2d = \
array([[1027.91845703125 , 338.2441711425781 ],
[1067.8787841796875 , 612.0115356445312 ],
[ 803.141357421875 , 500.10662841796875],
[1070.8758544921875 , 776.8713989257812 ],
[ 968.9768676757812 , 413.18048095703125],
[1012.9332885742188 , 925.7449340820312 ],
[1248.699462890625 , 491.1142578125 ],
[1089.8570556640625 , 412.18133544921875],
[ 995.9501342773438 , 611.0123901367188 ],
[ 871.073974609375 , 461.1397399902344 ],
[1181.765869140625 , 454.14569091796875],
[1003.9421997070312 , 775.8722534179688 ],
[1061.884765625 , 933.7380981445312 ]])
frame3_3d = \
array([[-0.003511453978717327 , -0.5015891194343567 , -0.10520103573799133 ],
[ 0.10480749607086182 , -0.00019206921570003033, -0.0004397481679916382 ],
[-0.47764456272125244 , -0.1816674768924713 , 0.04093759506940842 ],
[ 0.0936243087053299 , 0.3628539443016052 , -0.09391097724437714 ],
[-0.11445926129817963 , -0.41107428073883057 , 0.01644478738307953 ],
[-0.03567686676979065 , 0.720417320728302 , -0.10493464022874832 ],
[ 0.4529808759689331 , -0.18383921682834625 , -0.02210136130452156 ],
[ 0.1092790886759758 , -0.41095152497291565 , 0.011709243059158325 ],
[-0.10480757057666779 , 0.00018716813065111637, 0.0004445519298315048 ],
[-0.3031604290008545 , -0.2810187041759491 , 0.07747684419155121 ],
[ 0.3006024956703186 , -0.28319910168647766 , 0.043038371950387955 ],
[-0.07087739557027817 , 0.35837966203689575 , -0.08430898934602737 ],
[ 0.062416717410087585 , 0.7248380780220032 , -0.13536334037780762 ]])
#frame1_2d = np.asarray(frame1_2d, dtype=float)
#frame1_3d = np.asarray(frame1_3d, dtype=float)
#frame2_2d = np.asarray(frame2_2d, dtype=float)
#frame2_3d = np.asarray(frame2_3d, dtype=float)
#frame3_2d = np.asarray(frame3_2d, dtype=float)
#frame3_3d = np.asarray(frame3_3d, dtype=float)
# Globalize 3D Points
dist_coeffs = np.array([0.11480806073904032, -0.21946985653851792, 0.0012002116999769957, 0.008564577708855225, 0.11274677130853494])
camera_matrix = np.asarray([
    [1394.6027293299926, 0.0, 995.588675691456],
    [0.0, 1394.6027293299926, 599.3212928484164],
    [0.0, 0.0, 1.0]
])
# create rotation matrix of points
(success, rotation_vector, translation_vector) = cv2.solvePnP(frame3_3d, frame3_2d, camera_matrix, dist_coeffs, flags=0)  # flags=0 is cv2.SOLVEPNP_ITERATIVE
rotation_matrix = np.zeros((4, 4))
rotation_matrix[:3, :3], _ = cv2.Rodrigues(rotation_vector)  # Rodrigues returns (matrix, jacobian)
rotation_matrix[:3, 3] = translation_vector.ravel()
rotation_matrix[3, 3] = 1
# apply rotation matrix to points
globalized_3d = np.c_[frame1_3d, np.ones((13, 1))]  # homogeneous coordinates (x, y, z, 1)
for j in range(13):
    globalized_3d[j, :] = rotation_matrix @ globalized_3d[j, :]
print(globalized_3d)
Thanks in advance, any help is appreciated!
Edit: included some examples in my code, after incorporating what the accepted answer suggested.
Edit 2: using flags=1 (cv2.SOLVEPNP_EPNP instead of the default cv2.SOLVEPNP_ITERATIVE, which is 0) noticeably improved the result / removed a lot of the jitter!
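One way to sanity-check a recovered pose is to reproject the 3D model points with it and compare against the detected 2D points; large residuals point at bad correspondences or a wrong model. A minimal pinhole sketch (ignoring lens distortion, which cv2.projectPoints would account for):

```python
import numpy as np

def reproject(object_pts, R, tvec, K):
    """Project (N, 3) model points with pose (R, t) through intrinsics K.

    Plain pinhole model -- lens distortion is ignored here;
    cv2.projectPoints would include it.
    """
    object_pts = np.asarray(object_pts, dtype=float)
    cam = (R @ object_pts.T).T + np.ravel(tvec)   # model frame -> camera frame
    uv = (K @ cam.T).T                            # pinhole projection
    return uv[:, :2] / uv[:, 2:3]                 # perspective divide by depth
```

Comparing `reproject(frame3_3d, rotation_matrix[:3, :3], translation_vector, camera_matrix)` against `frame3_2d` shows how well the pose explains the detections.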
- Yes, solvePnP can be used for this.
- Yes, your math is wrong.
I assume you get the points from a facial landmark detector, so they come in a fixed order. I also assume your 3D model points are given in the same order, that their values are consistent, and that they roughly resemble the face you are looking at. You should exclude the points that represent flesh and the jawbone (as opposed to the skull). You really want to track the skull, not the positions of lips and jaw that move all over the place.
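A sketch of that filtering, with made-up indices (RIGID_IDX below is purely illustrative; you have to look up which of your detector's landmarks actually sit on rigid bone, e.g. eye corners and nose bridge rather than lips or jaw):

```python
import numpy as np

# Hypothetical indices of landmarks on rigid parts of the skull --
# substitute the ones that apply to your own detector's layout.
RIGID_IDX = [0, 1, 4, 7, 8]

def select_rigid(pts_2d, pts_3d, idx=RIGID_IDX):
    """Keep only the 2D/3D correspondence pairs at the given indices."""
    return np.asarray(pts_2d)[idx], np.asarray(pts_3d)[idx]
```

Feeding only those pairs to solvePnP means the pose is estimated from points that move rigidly with the head.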
rvec is an axis-angle encoding. Its length is the amount of rotation (expected to lie between 0 and 3.14 = pi), and its direction is the axis of rotation.
Use cv.Rodrigues to turn rvec into a 3x3 rotation matrix.
Really, just build yourself some functions that take an rvec and a tvec and construct a 4x4 matrix. Extending all the points to (x, y, z, 1) is a hassle, but you only do it once.
And make sure to use @ for matrix multiplication (or np.dot, np.matmul, ...), because * is element-wise multiplication.
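A sketch of such helpers, assuming nothing beyond NumPy (the rotation block is the Rodrigues formula written out, so cv2.Rodrigues(rvec)[0] would give the same 3x3 matrix):

```python
import numpy as np

def pose_to_matrix(rvec, tvec):
    """Build a 4x4 homogeneous transform from an axis-angle rvec and a tvec."""
    rvec = np.asarray(rvec, dtype=float).ravel()
    tvec = np.asarray(tvec, dtype=float).ravel()
    theta = np.linalg.norm(rvec)              # amount of rotation (0..pi)
    if theta < 1e-12:
        R = np.eye(3)                         # no rotation
    else:
        k = rvec / theta                      # unit rotation axis
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])    # skew-symmetric cross-product matrix
        R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = tvec
    return T

def transform_points(T, pts):
    """Apply a 4x4 transform to an (N, 3) array of points in one shot."""
    homo = np.c_[pts, np.ones(len(pts))]      # extend to (x, y, z, 1)
    return (T @ homo.T).T[:, :3]              # note @, not *
```

With these, the per-point loop in the question collapses to `transform_points(pose_to_matrix(rotation_vector, translation_vector), frame1_3d)`.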