OpenCV: Strange rotation and translation matrices from decomposeHomographyMat

I am trying to solve a plane-to-plane projection problem using the findHomography function from the OpenCV library. As a toy example I have a set of points P in R2, and a second set of points Q in R2, where Qx = Px + 50 and Qy = Py. That is, I have shifted the x coordinate by 50. Now I run the following code:

Mat projectionMatrix = findHomography(Q, P);
vector<Point2f> projectedPoints(Q.size());
perspectiveTransform(Q, projectedPoints, projectionMatrix);

This gives me P, which is great. However, now I want the rotation and translation matrices, R & T, and this is where I get confused. OpenCV 3 has a function called decomposeHomographyMat that returns up to 4 solutions for R and T (it also returns the plane normals, but I don't store those). I run it like this:

vector<Mat> Rs;
vector<Mat> Ts;
decomposeHomographyMat(projectionMatrix, cameraMatrix, Rs, Ts, noArray());

The cameraMatrix I use is tried and tested from previous experiments. OK, so I get my four results. Looking at them, I notice that I get the identity matrix for all the Rs, which is great. However, all the translation vectors are [0,0,0]T, whereas I would expect at least one of them to be [-50,0,0]T. Is there something I need to do with the result of decomposeHomographyMat to get the expected behaviour?
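For reference, the homography mapping Q to P for a pure 50 px x-shift, and its K-normalization, can be checked with a quick numpy sketch (my own illustration, with a hypothetical pinhole calibration matrix; the normalized matrix has a rank-1 term, so a nonzero translation component is indeed expected from the decomposition):

```python
import numpy as np

# For Qx = Px + 50, the homography mapping Q to P is a pure pixel translation
H = np.array([[1.0, 0.0, -50.0],
              [0.0, 1.0,   0.0],
              [0.0, 0.0,   1.0]])

# hypothetical calibration matrix (f = 100, principal point at image centre)
f = 100.0
K = np.array([[f, 0, 320.0],
              [0, f, 240.0],
              [0, 0,   1.0]])

# decomposeHomographyMat normalizes with K before factoring; here the
# normalized matrix is the identity plus a rank-1 term of magnitude 50/f
G = np.linalg.inv(K) @ H @ K
print(G)  # -> identity with G[0, 2] == -0.5
```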

Thanks

It turned out I was wrong about some things, so I have decided to rewrite this answer.

In short - you are getting strange results because of an incorrect intrinsic parameter matrix.

Using the terminology of the paper by Malis and Vargas, "Deeper understanding of the homography decomposition for vision-based control" (on which the homography decomposition in OpenCV is based): the perspective transformation is denoted by H and called the Euclidean homography matrix, and the result of its normalization, G = K^-1 * H * K (where K is the camera calibration matrix), is called the homography matrix.

Both cv::findHomography() and cv::decomposeHomographyMat() work with the Euclidean homography matrix. But in order to decompose it into translation and rotation, cv::decomposeHomographyMat() normalizes the Euclidean homography matrix to obtain the homography matrix. It relies on the K provided by the user to perform this normalization.
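To make that normalization concrete, here is a small numpy sketch (my own illustration with made-up plane and motion parameters, following the plane-induced homography model from the paper: for a plane with normal n at distance d seen under motion (R, t), the homography matrix is R + t nᵀ / d):

```python
import numpy as np

# hypothetical camera calibration
f, w, h = 100.0, 640, 480
K = np.array([[f, 0, w/2],
              [0, f, h/2],
              [0, 0, 1.0]])

# plane n·X = d in the first camera frame, and motion from camera 1 to 2
n = np.array([[0.0, 0.0, 1.0]]).T
d = 10.0
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1.0]])
t = np.array([[100.0, 200.0, 300.0]]).T

# homography matrix: what decomposeHomographyMat actually factors into R and t
G = R + (t / d) @ n.T

# Euclidean homography matrix: what findHomography estimates from pixel points
H = K @ G @ np.linalg.inv(K)

# the K-normalization undoes the calibration and recovers G
assert np.allclose(np.linalg.inv(K) @ H @ K, G)
```

Note that G contains t / d rather than t, so the quality of the result depends directly on the K used for the normalization.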

As for estimating K, I think that is outside the scope of this question. The problem is called camera auto-calibration; here is the relevant quote from the Wikipedia article on it:

Therefore, three views are the minimum needed for full calibration with fixed intrinsic parameters between views. Quality modern imaging sensors and optics may also provide further prior constraints on the calibration such as zero skew (orthogonal pixel grid) and unity aspect ratio (square pixels). Integrating these priors will reduce the minimal number of images needed to two.

So it appears that, under the zero-skew and square-pixel assumptions, you can extract K from the image correspondences of 2+ frames taken with the same camera. However, I am not familiar with this topic, so I cannot give you more advice there.

So, to check whether my interpretation is correct, I made a small example that projects some points on a 3D plane onto 2 virtual cameras, finds a homography, decomposes it, and allows comparing this decomposition to the ground-truth rotation and translation vectors. This is better than real inputs, because this way we know K precisely and do not have to conflate its estimation error with that of R and t. For the inputs I have checked it estimates the rotation and translation vectors correctly, although for some reason the translation is always 10 times smaller than the ground truth. Maybe this decomposition is only defined up to a scale (I am not sure right now), but it is interesting that it is related to the ground-truth value by a fixed coefficient.
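A possible explanation for that fixed factor (my own note, not verified against the OpenCV source): the homography matrix of a plane n·X = d is R + (t / d) nᵀ, so the decomposition can only ever return t / d, and the plane in the example below sits at z = 10. A minimal numpy sketch:

```python
import numpy as np

# the plane used in the experiment sits at z = 10, so d = 10
d = 10.0
t = np.array([[100.0, 200.0, 300.0]]).T  # ground-truth translation
n = np.array([[0.0, 0.0, 1.0]]).T        # plane normal

# homography matrix with R = I for simplicity: I + (t / d) n^T
G = np.eye(3) + (t / d) @ n.T

# read t / d back out of the last column
t_recovered = G[:, 2:3] - np.array([[0.0], [0.0], [1.0]])
print(t_recovered.ravel())  # -> [10. 20. 30.], i.e. the ground truth over d
```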

The source is below:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>


int main() {
  // set up a virtual camera
  float f = 100, w = 640, h = 480;

  cv::Mat1f K = (cv::Mat1f(3, 3) <<
      f, 0, w/2,
      0, f, h/2,
      0, 0,   1);

  // set transformation from 1st to 2nd camera (assume K is unchanged)
  cv::Mat1f rvecDeg = (cv::Mat1f(3, 1) << 45, 12, 66);
  cv::Mat1f t = (cv::Mat1f(3, 1) << 100, 200, 300);

  std::cout << "-------------------------------------------\n";
  std::cout << "Ground truth:\n";

  std::cout << "K = \n" << K << std::endl << std::endl;
  std::cout << "rvec = \n" << rvecDeg << std::endl << std::endl;
  std::cout << "t = \n" << t << std::endl << std::endl;

  // set up points on a plane
  std::vector<cv::Point3f> p3d{{0, 0, 10},
                               {100, 0, 10},
                               {0, 100, 10},
                               {100, 100, 10}};

  // project on both cameras
  std::vector<cv::Point2f> Q, P, S;

  cv::projectPoints(p3d,
                    cv::Mat1d::zeros(3, 1),
                    cv::Mat1d::zeros(3, 1),
                    K,
                    cv::Mat(),
                    Q);

  cv::projectPoints(p3d,
                    rvecDeg*CV_PI/180,
                    t,
                    K,
                    cv::Mat(),
                    P);

  // find homography
  cv::Mat H = cv::findHomography(Q, P);

  std::cout << "-------------------------------------------\n";
  std::cout << "Estimated H = \n" << H << std::endl << std::endl;


  // check by reprojection
  std::vector<cv::Point2f> P_(P.size());
  cv::perspectiveTransform(Q, P_, H);

  float sumError = 0;

  for (size_t i = 0; i < P.size(); i++) {
    sumError += cv::norm(P[i] - P_[i]);
  }

  std::cout << "-------------------------------------------\n";
  std::cout << "Average reprojection error = "
      << sumError/P.size() << std::endl << std::endl;


  // decompose using the camera matrix K
  std::vector<cv::Mat> Rs, Ts;
  cv::decomposeHomographyMat(H,
                             K,
                             Rs, Ts,
                             cv::noArray());

  std::cout << "-------------------------------------------\n";
  std::cout << "Estimated decomposition:\n\n";
  std::cout << "rvec = " << std::endl;
  for (auto R_ : Rs) {
    cv::Mat1d rvec;
    cv::Rodrigues(R_, rvec);
    std::cout << rvec*180/CV_PI << std::endl << std::endl;
  }

  std::cout << std::endl;

  std::cout << "t = " << std::endl;
  for (auto t_ : Ts) {
    std::cout << t_ << std::endl << std::endl;
  }

  return 0;
}

And here is the output on my machine:

-------------------------------------------
Ground truth:
K =
[100, 0, 320;
0, 100, 240;
0, 0, 1]

rvec =
[45;
12;
66]

t =
[100;
200;
300]

-------------------------------------------
Estimated H =
[0.04136041220427821, 0.04748763375951008, 358.5557917287962;
0.05074854454707714, 0.06137211243830468, 297.4585754092336;
8.294458769850147e-05, 0.0002294875562580223, 1]

-------------------------------------------
Average reprojection error = 0

-------------------------------------------
Estimated decomposition:

rvec =
[-73.21470385654712;
56.64668212487194;
82.09114210289061]

[-73.21470385654712;
56.64668212487194;
82.09114210289061]

[45.00005330430893;
12.00000697952995;
65.99998380038915]

[45.00005330430893;
12.00000697952995;
65.99998380038915]


t =
[10.76993852870029;
18.60689642878277;
30.62344129378435]

[-10.76993852870029;
-18.60689642878277;
-30.62344129378435]

[10.00001378255982;
20.00002581449634;
30.0000336510648]

[-10.00001378255982;
-20.00002581449634;
-30.0000336510648]

As you can see, among the hypotheses there is a correct estimate of the rotation vector, and an up-to-scale correct estimate of the translation.

You should instead use solvePnP on each plane, and then obtain the relative translation and rotation from the two camera poses (assuming you have at least 4 points). The problem with decomposeHomographyMat is that you get up to 4 solutions... You can drop the two that would land outside the image FOV, but that is still a problem.
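A sketch of the relative-pose composition this approach implies (the per-view poses below are hypothetical stand-ins; in practice R1, t1 and R2, t2 would come from cv2.solvePnP followed by cv2.Rodrigues on each view):

```python
import numpy as np

def rot_x(deg):
    # helper: rotation matrix about the x axis
    a = np.deg2rad(deg)
    return np.array([[1, 0, 0],
                     [0, np.cos(a), -np.sin(a)],
                     [0, np.sin(a),  np.cos(a)]])

# hypothetical per-view poses (world -> camera): x_cam = R @ X + t
R1, t1 = rot_x(10), np.array([[0.0], [0.0], [500.0]])
R2, t2 = rot_x(25), np.array([[50.0], [0.0], [520.0]])

# relative motion from camera 1 to camera 2:
# x2 = R2 @ R1.T @ x1 + (t2 - R2 @ R1.T @ t1)
R_rel = R2 @ R1.T
t_rel = t2 - R_rel @ t1

# sanity check: map a world point through both routes
X = np.array([[10.0], [20.0], [30.0]])
x1 = R1 @ X + t1
x2 = R2 @ X + t2
assert np.allclose(R_rel @ x1 + t_rel, x2)
```

Unlike the homography decomposition, this gives a single, metric-scale answer (solvePnP knows the 3D coordinates of the plane points), at the cost of needing those 3D coordinates.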

  • @alexisrozhkov - I found the error in your code - the article and the function both assume that the Z plane is at 1 (not at z=10 as you have it). Added below is a correct reimplementation of your code above (fixed, and in python):
import cv2
import numpy as np


# set up a virtual camera
f = 100
w = 640
h = 480

K = np.array([[f, 0, w/2],
              [0, f, h/2],
              [0, 0,   1]])
dist_coffs = np.zeros(5)
# set transformation from 1st to 2nd camera (assume K is unchanged)
rvecDeg = np.array([[45, 12, 66]]).T
t = np.array([[100.0, 200, 300]]).T

print("-------------------------------------------\n")
print("Ground truth:\n")

print("K = \n" + str(K))
print("rvec = \n" + str(rvecDeg))
print("t = \n" + str(t))

# set up points on a plane
p3d = np.array([[0, 0, 1],
                [100, 0, 1],
                [0, 100, 1],
                [100, 100, 1]], dtype=np.float64)

# project on both cameras

Q, _ = cv2.projectPoints(p3d,
                         np.zeros((3, 1)),
                         np.zeros((3, 1)),
                         K,
                         dist_coffs)

P, _ = cv2.projectPoints(p3d,
                         rvecDeg*np.pi/180,
                         t,
                         K,
                         dist_coffs)

# find homography
H, _ = cv2.findHomography(Q, P)

print("-------------------------------------------\n")
print("Estimated H = \n" + str(H))


# check by reprojection
P_ = cv2.perspectiveTransform(Q, H)

sumError = 0

for i in range(P.shape[0]):
    sumError += np.linalg.norm(P[i] - P_[i])


print("-------------------------------------------\n")
print("Average reprojection error = "+str(sumError/P.shape[0]))


# decompose using the camera matrix K
num_res, Rs, ts, n = cv2.decomposeHomographyMat(H, K)

print("-------------------------------------------\n")
print("Estimated decomposition:\n\n")
for i, Rt in enumerate(zip(Rs, ts)):
    R, t = Rt
    print("option " + str(i+1))
    print("rvec = ")
    rvec, _ = cv2.Rodrigues(R)
    print(rvec*180/np.pi)
    print("t = ")
    print(t)

And here are the results (option 3 is the correct one):

-------------------------------------------

Ground truth:

K = 
[[100.   0. 320.]
 [  0. 100. 240.]
 [  0.   0.   1.]]
rvec = 
[[45]
 [12]
 [66]]
t = 
[[100.]
 [200.]
 [300.]]
-------------------------------------------

Estimated H = 
[[3.93678513e-03 4.51998690e-03 3.53830416e+02]
 [4.83037187e-03 5.84154353e-03 3.05790229e+02]
 [7.89487379e-06 2.18431675e-05 1.00000000e+00]]
-------------------------------------------

Average reprojection error = 1.1324020050061362e-05
-------------------------------------------

Estimated decomposition:


option 1
rvec = 
[[-78.28555877]
 [ 58.01301837]
 [ 81.41634175]]
t = 
[[100.79816988]
 [198.59277542]
 [300.6675498 ]]
option 2
rvec = 
[[-78.28555877]
 [ 58.01301837]
 [ 81.41634175]]
t = 
[[-100.79816988]
 [-198.59277542]
 [-300.6675498 ]]
option 3
rvec = 
[[45.0000205 ]
 [12.00005488]
 [65.99999505]]
t = 
[[100.00010826]
 [200.00026791]
 [300.00034698]]
option 4
rvec = 
[[45.0000205 ]
 [12.00005488]
 [65.99999505]]
t = 
[[-100.00010826]
 [-200.00026791]
 [-300.00034698]]