R ginv and MATLAB pinv produce different results
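Compared with MATLAB's pinv() function, the ginv() function in R's MASS package produces completely different values. Both claim to compute the Moore-Penrose generalized inverse of a matrix.
I tried setting the same tolerance for the R implementation, but the difference persists.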
- MATLAB default tolerance:
max(size(A)) * norm(A) * eps(class(A))
- R default tolerance:
sqrt(.Machine$double.eps)
To reproduce:
R:
library(MASS)
A <- matrix(c(47,94032,149, 94032, 217179406,313679,149,313679,499),3,3)
ginv(A)
Output:
[,1] [,2] [,3]
[1,] 1.675667e-03 -8.735203e-06 5.545605e-03
[2,] -8.735203e-06 5.014084e-08 -2.890907e-05
[3,] 5.545605e-03 -2.890907e-05 1.835313e-02
svd(A)
Output:
$d
[1] 2.171799e+08 4.992800e+01 2.302544e+00
$u
[,1] [,2] [,3]
[1,] -0.0004329688 0.289245088 -9.572550e-01
[2,] -0.9999988632 -0.001507826 -3.304234e-06
[3,] -0.0014443299 0.957253888 2.892454e-01
$v
[,1] [,2] [,3]
[1,] -0.0004329688 0.289245088 -9.572550e-01
[2,] -0.9999988632 -0.001507826 -3.304234e-06
[3,] -0.0014443299 0.957253888 2.892454e-01
MATLAB:
A = [47 94032 149; 94032 217179406 313679; 149 313679 499]
pinv(A)
Output:
ans =
0.3996 -0.0000 -0.1147
-0.0000 0.0000 -0.0000
-0.1147 -0.0000 0.0547
svd:
[U, S, V] = svd(A)
U =
-0.0004 0.2892 -0.9573
-1.0000 -0.0015 -0.0000
-0.0014 0.9573 0.2892
S =
1.0e+008 *
2.1718 0 0
0 0.0000 0
0 0 0.0000
V =
-0.0004 0.2892 -0.9573
-1.0000 -0.0015 -0.0000
-0.0014 0.9573 0.2892
Solution:
To make R's ginv behave like MATLAB's pinv, use this function:
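#' Pseudo-Inverse of Matrix
#' @description
#' This is a modified version of the ginv function in the MASS package.
#' It produces a MATLAB-like pseudo-inverse of a matrix.
#' @param X The matrix whose pseudo-inverse is computed
#' @param tol Absolute cutoff for singular values. The default (NULL) uses
#'   the MATLAB rule quoted in the question: max(size(A)) * norm(A) * eps
#'
#' @return The pseudo-inverse of the matrix
#' @export
#'
#' @examples
#' A <- matrix(1:6,3,2)
#' pinv(A)
pinv <- function (X, tol = NULL)
{
    if (length(dim(X)) > 2L || !(is.numeric(X) || is.complex(X)))
        stop("'X' must be a numeric or complex matrix")
    if (!is.matrix(X))
        X <- as.matrix(X)
    Xsvd <- svd(X)
    if (is.complex(X))
        Xsvd$u <- Conj(Xsvd$u)
    # norm(A) is the 2-norm, i.e. the largest singular value Xsvd$d[1L]
    if (is.null(tol))
        tol <- max(dim(X)) * Xsvd$d[1L] * .Machine$double.eps
    # Unlike ginv(), tol is an absolute cutoff and is not rescaled by d[1L].
    # The comparison is elementwise, so rank-deficient input is still truncated.
    Positive <- Xsvd$d > max(tol, 0)
    if (all(Positive))
        Xsvd$v %*% (1 / Xsvd$d * t(Xsvd$u))
    else if (!any(Positive))
        array(0, dim(X)[2L:1L])
    else
        Xsvd$v[, Positive, drop = FALSE] %*%
            ((1 / Xsvd$d[Positive]) * t(Xsvd$u[, Positive, drop = FALSE]))
}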
Running debugonce(MASS::ginv), we see that the difference lies in what is done with the singular value decomposition.
Specifically, R checks the following:
tol <- sqrt(.Machine$double.eps)  # ginv's default tolerance
Xsvd <- svd(A)
Positive <- Xsvd$d > max(tol * Xsvd$d[1L], 0)
Positive
# [1] TRUE TRUE FALSE
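To see why the third singular value is rejected, it may help to compute both cutoffs explicitly. This is only a sketch: it assumes that norm(A) in the MATLAB formula quoted in the question is the 2-norm (the largest singular value), and the names cutoff_r and cutoff_matlab are illustrative rather than part of either library.

```r
# Matrix from the question
A <- matrix(c(47, 94032, 149,
              94032, 217179406, 313679,
              149, 313679, 499), 3, 3)
d <- svd(A)$d  # singular values, largest first

# ginv(): tol is relative, so it is scaled by the largest singular value
cutoff_r <- sqrt(.Machine$double.eps) * d[1]

# MATLAB formula from the question: an absolute cutoff
# (assumption: norm(A) is the 2-norm, which equals d[1])
cutoff_matlab <- max(dim(A)) * d[1] * .Machine$double.eps

print(c(R = cutoff_r, MATLAB = cutoff_matlab))
print(d > cutoff_r)       # singular values kept by the ginv() rule
print(d > cutoff_matlab)  # singular values kept by the MATLAB rule
```

With these assumptions the R cutoff comes out a little above 3, which is larger than the smallest singular value (about 2.3), while the MATLAB cutoff is many orders of magnitude smaller and keeps all three.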
If the third element were TRUE (which we can force by setting tol = 0, as @nicola suggested), MASS::ginv would return:
Xsvd$v %*% (1/Xsvd$d * t(Xsvd$u))
# [,1] [,2] [,3]
# [1,] 3.996430e-01 -7.361507e-06 -1.147047e-01
# [2,] -7.361507e-06 5.014558e-08 -2.932415e-05
# [3,] -1.147047e-01 -2.932415e-05 5.468812e-02
(i.e., the same as MATLAB).
Instead, it returns:
Xsvd$v[, Positive, drop = FALSE] %*% ((1/Xsvd$d[Positive]) *
t(Xsvd$u[, Positive, drop = FALSE]))
# [,1] [,2] [,3]
# [1,] 1.675667e-03 -8.735203e-06 5.545605e-03
# [2,] -8.735203e-06 5.014084e-08 -2.890907e-05
# [3,] 5.545605e-03 -2.890907e-05 1.835313e-02
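One way to read the two results is through the first Penrose condition, A G A = A. The sketch below uses a hypothetical helper, svd_pinv(), that keeps only the k largest singular values. It is not MASS::ginv itself, just the same formula with the rank chosen by hand, and it handles real matrices only.

```r
# Matrix from the question
A <- matrix(c(47, 94032, 149,
              94032, 217179406, 313679,
              149, 313679, 499), 3, 3)

# Hypothetical helper: pseudo-inverse built from the k largest singular values
svd_pinv <- function(A, k) {
  s <- svd(A)
  idx <- seq_len(k)
  s$v[, idx, drop = FALSE] %*% ((1 / s$d[idx]) * t(s$u[, idx, drop = FALSE]))
}

G2 <- svd_pinv(A, 2)  # matches the ginv() output with its default tol
G3 <- svd_pinv(A, 3)  # matches the MATLAB pinv output

# Largest absolute entry of A G A - A for each candidate
res2 <- max(abs(A %*% G2 %*% A - A))
res3 <- max(abs(A %*% G3 %*% A - A))
print(c(rank2 = res2, rank3 = res3))
```

The rank-2 residual should be on the order of the discarded singular value, while the rank-3 residual should be close to rounding error. Put differently, the ginv() output is the pseudo-inverse of the best rank-2 approximation of A rather than of A itself.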
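Thanks to @FaridCher for pointing out the source code of pinv.
I'm not sure I understand the MATLAB code 100%, but I think it comes down to a difference in how tol is used. The MATLAB counterpart of Positive in R is: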
`r = sum(s>tol)`
where tol is supplied by the user; if none is supplied, we get:
m = 0;
% I don't get the point of this for loop -- why not just `m = max(size(A))`?
for i = 1:n
m = max(m,length(A(:,i)));
end
% contrast with simply `tol * Xsvd$d[1L]` in R
% (note: i believe the elements of d are sorted largest to smallest)
tol = m*eps(max(s));
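The MATLAB default above can be approximated in R as follows. R has no built-in eps(x), so the sketch defines a hypothetical helper eps_at() that returns the spacing of doubles at x; it is valid only for normal, nonzero values. Taking m as max(dim(A)) is also an assumption that happens to agree with the loop for this square matrix.

```r
# Hypothetical helper: spacing of double-precision numbers at x,
# intended to mirror MATLAB's eps(x) for normal, nonzero x
eps_at <- function(x) 2^(floor(log2(abs(x))) - 52)

# Matrix from the question
A <- matrix(c(47, 94032, 149,
              94032, 217179406, 313679,
              149, 313679, 499), 3, 3)
s <- svd(A)$d

m <- max(dim(A))           # assumed to match what the MATLAB loop computes here
tol <- m * eps_at(max(s))  # tol = m*eps(max(s))
r <- sum(s > tol)          # r = sum(s>tol)

print(tol)
print(r)
```

Note that tol here is compared with the singular values directly, whereas ginv() multiplies its tol by the largest singular value first.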