KWDual 中的 REBayes 错误 MKS_RES_TERM_STALL

REBayes Error in KWDual MKS_RES_TERM_STALL

我正在尝试运行下面的模拟。请注意,这确实需要安装 Mosek 和 RMosek!

我一直收到错误

Error in KWDual(A, d, w, ...) : Mosek error: MSK_RES_TRM_STALL: The optimizer is terminated due to slow progress.

如何解决 MSK_RES_TRM_STALL 错误?

进一步研究

在为此查找文档时,我发现了这个:

The optimizer is terminated due to slow progress. Stalling means that numerical problems prevent the optimizer from making reasonable progress and that it makes no sense to continue. In many cases this happens if the problem is badly scaled or otherwise ill-conditioned. There is no guarantee that the solution will be feasible or optimal. However, often stalling happens near the optimum, and the returned solution may be of good quality. Therefore, it is recommended to check the status of the solution. If the solution status is optimal the solution is most likely good enough for most practical purposes. Please note that if a linear optimization problem is solved using the interior-point optimizer with basis identification turned on, the returned basic solution likely to have high accuracy, even though the optimizer stalled. Some common causes of stalling are a) badly scaled models, b) near feasible or near infeasible problems.

所以我检查了最终值A,但里面什么也没有。我发现如果我将模拟从 1000 更改为 30,我确实会得到值 (A <- sim1(30, 30, setting = 1)),但这是次优的。

可重现脚本

KFE <- function(y, T = 300, lambda = 1/3){
    # Kernel Fourier Estimator: Stefanski and Carroll (Statistics, 1990)
    ks <- function(s,x) exp(s^2/2) * cos(s * x)
    K <- function(t, y, lambda = 1/3){
    k <- y
    for(i in 1:length(y)){
        k[i] <- integrate(ks, 0, 1/lambda, x = (y[i] - t))$value/pi 
    }
    mean(k)
    }
    eps <- 1e-04
    if(length(T) == 1) T <- seq(min(y)-eps, max(y)+eps, length = T)
    g <- T
    for(j in 1:length(T))
    g[j] <- K(T[j], y, lambda = lambda)
    list(x = T, y = g)
}
BDE <- function(y, T = 300, df = 5, c0 = 1){
    # Bayesian Deconvolution Estimator: Efron (B'ka, 2016)
    require(splines)
    eps <- 1e-04
    if(length(T) == 1) T <- seq(min(y)-eps, max(y)+eps, length = T)
    X <- ns(T, df = df)
    a0 <- rep(0, ncol(X))
    A <- dnorm(outer(y,T,"-"))
    qmle <- function(a, X, A, c0){
    g <- exp(X %*% a)
    g <- g/sum(g)
    f <- A %*% g
    -sum(log(f)) + c0 * sum(a^2)^.5
    }
    ahat <- nlm(qmle, a0, X=X, A=A, c0 = c0)$estimate
    g <- exp(X %*% ahat)
    g <- g/integrate(approxfun(T,g),min(T),max(T))$value
    list(x = T,y = g)
}
W <- function(G, h, interp = FALSE, eps = 0.001){
    #Wasserstein distance:  ||G-H||_W
    H <- cumsum(h$y)
    H <- H/H[length(H)]
    W <- integrate(approxfun(h$x, abs(G(h$x) - H)),min(h$x),max(h$x))$value
    list(W=W, H=H)
}

biweight <- function(x0, x, bw){
    t <- (x - x0)/bw
    (1-t^2)^2*((t> -1 & t<1)-0) *15/16
}
Wasser <- function(G, h, interp = FALSE, eps = 0.001, bw = 0.7){
    #Wasserstein distance:  ||G-H||_W
    if(interp == "biweight"){
    yk = h$x
    for (j in 1:length(yk))
        yk[j] = sum(biweight(h$x[j], h$x, bw = bw)*h$y/sum(h$y))
    H <- cumsum(yk)
    H <- H/H[length(H)]
    }
    else {
    H <- cumsum(h$y)
    H <- H/H[length(H)]
    }
    W <- integrate(approxfun(h$x, abs(G(h$x) - H)),min(h$x),max(h$x), 
           rel.tol = 0.001, subdivisions = 500)$value
    list(W=W, H=H)
}

sim1 <- function(n, R = 10, setting = 0){
    A <- matrix(0, 4, R)
    if(setting == 0){
    G0 <- function(t) punif(t,0,6)/8 + 7 * pnorm(t, 0, 0.5)/8  
    rf0 <- function(n){
        s <- sample(0:1, n, replace = TRUE, prob = c(1,7)/8)
        rnorm(n) + (1-s) * runif(n,0,6) + s * rnorm(n,0,0.5)
    }
    }
    else{   
    G0 <- function(t) 0 + 7 * (t > 0)/8 + (t > 2)/8
    rf0 <- function(n){
        s <- sample(0:1, n, replace = TRUE, prob = c(1,7)/8)
        rnorm(n) + (1-s) * 2 + s * 0
    }
    }
    for(i in 1:R){
    y <- rf0(n)
    g <- BDE(y)
    Wg <- Wasser(G0, g)
    h <- GLmix(y)
    Wh <- Wasser(G0, h)
    Whs <- Wasser(G0, h, interp = "biweight")
    k <- KFE(y)
    Wk <- Wasser(G0, k)
    A[,i] <- c(Wg$W, Wk$W, Wh$W, Whs$W)
    }
    A
}
require(REBayes)
set.seed(12)
A <- sim1(1000, 1000, setting = 1)


我 运行 代码确实在最后停滞了,但解决方案并不比前面没有停滞的情况更糟糕:

17  1.7e-07  3.1e-10  6.8e-12  1.00e+00   5.345949918e+00   5.345949582e+00   2.4e-10  0.40  
18  2.6e-08  3.8e-11  2.9e-13  1.00e+00   5.345949389e+00   5.345949348e+00   2.9e-11  0.41  
19  2.6e-08  3.8e-11  2.9e-13  1.00e+00   5.345949389e+00   5.345949348e+00   2.9e-11  0.48  
20  2.6e-08  3.8e-11  2.9e-13  1.00e+00   5.345949389e+00   5.345949348e+00   2.9e-11  0.54  
Optimizer terminated. Time: 0.62    


Interior-point solution summary
  Problem status  : PRIMAL_AND_DUAL_FEASIBLE
  Solution status : OPTIMAL
  Primal.  obj: 5.3459493890e+00    nrm: 6e+00    Viol.  con: 2e-08    var: 0e+00    cones: 4e-09  
  Dual.    obj: 5.3459493482e+00    nrm: 7e-01    Viol.  con: 1e-11    var: 4e-11    cones: 0e+00

目前对我有用的一个快速技巧是在对 GLmix:

的调用中稍微放宽终止时间 运行ces
control <- list()
control$dparam <- list(INTPNT_CO_TOL_REL_GAP=1e-7,INTPNT_CO_TOL_PFEAS=1e-7,INTPNT_CO_TOL_DFEAS=1e-7)
h <- GLmix(y,control=control,verb=5)

正如我在评论中指出的那样,更好的解决方案是不要将停顿终止代码视为 REBayes 包的错误,而是使用解决方案 status/quality。

我修改了 KWDual 的 return 以避免出现此类消息,前提是 来自 Mosek 的状态 sol$itr$solsta 在 CRAN 上的 REBayes v2.2 中是 "Optimal"。