Finding an interval that covers 95% of the area under an asymmetrical curve

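I want to move from the mode (red vertical line) of a curve called posterior toward the tails, and stop when 95% of the area under posterior is covered. My goal is to find the shortest interval (in x-axis units) that achieves this. How can I obtain the two limits of such an interval?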

Note: I have tried the first solution, but it does not apply to the current problem!

P.S. Note that my posterior curve is not symmetric, so the shortest 95% interval is the best choice.

Here are my functions:

     prior = function(x) dnorm(x)
likelihood = function(x) dt(1.46, 19, x*sqrt(20))
 posterior = function(x) prior(x)*likelihood(x)

mode = optimize(posterior, interval = c(-2, 2), maximum = TRUE, tol = 1e-12)[[1]]
curve(posterior, -2, 2, n = 1e4)
abline(v = mode, col = 2)

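I believe the way to solve this is analogous to the approach in coda::HPDinterval (adapted to work on a density): start from the peak of the curve and move a horizontal line downward; for each level, invert the two halves of the curve to find the intersection points; then measure the area between the intersection points.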

Setup:

prior = function(x) dnorm(x)
likelihood = function(x) dt(1.46, 19, x*sqrt(20))
posterior = function(x) prior(x)*likelihood(x)

mode = optimize(posterior, interval = c(-2, 2), maximum = TRUE, tol = 1e-12)[[1]]
curve(posterior, -2, 2, n = 1e4)
abline(v = mode, col = 2)

Inverse function of the posterior, one side at a time:

inverse.posterior <- function(x,side="left") {
  target <- function(y) posterior(y)-x
  ur <- switch(side,
    left=uniroot(target,interval=c(-2,mode)),
    right=uniroot(target,interval=c(mode,2)))
  return(ur$root)
}

i1 <- inverse.posterior(0.07,"left")
i2 <- inverse.posterior(0.07,"right")
abline(h=0.07,col="gray")
abline(v=c(i1,i2),col="gray")

Compute the area corresponding to a given horizontal cutoff:

areafun <- function(h) {
  i1 <- inverse.posterior(h,"left")
  i2 <- inverse.posterior(h,"right")
  return(integrate(posterior,i1,i2)$value)
}

areafun(0.07)

Find the height that gives a specified fraction of the density:

post.area <- integrate(posterior,-2,2)$value
find.lims <- function(a) {
  ur <- uniroot(function(h) areafun(h)/post.area-a,
       c(0.01,posterior(mode)-0.01))
  return(ur$root)
}

Try it out:

f <- find.lims(0.95)
## critical height = 0.02129
lwr <- inverse.posterior(f,"left")  ## -0.124
upr <- inverse.posterior(f,"right") ## 0.753
integrate(posterior,lwr,upr)$value/post.area ## 0.9499
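
As an independent cross-check of the root-finding result above, the same interval can be approximated on a grid: evaluate the density on a fine grid, keep the highest-density points until they hold 95% of the total mass, and report the range of the retained points. This is only a rough sketch. It assumes the density is vectorized and unimodal on the search range, and grid.HPD is a hypothetical helper name, not a function from coda or base R.

```r
# Grid-based approximation of a highest-density interval.
# Assumes `dens` is vectorized and unimodal on [lwr, upr];
# `grid.HPD` is a hypothetical helper, not part of coda or base R.
grid.HPD <- function(dens, lwr = -2, upr = 2, level = 0.95, n = 2e4) {
  x <- seq(lwr, upr, length.out = n)
  d <- dens(x)
  ord <- order(d, decreasing = TRUE)   # grid points, highest density first
  cum <- cumsum(d[ord]) / sum(d)       # share of total mass accumulated so far
  k <- which(cum >= level)[1]          # fewest points that reach the level
  range(x[ord[seq_len(k)]])            # end points of the retained region
}

posterior <- function(x) dnorm(x) * dt(1.46, 19, x * sqrt(20))
hpd.grid <- grid.HPD(posterior)        # compare with lwr and upr above
hpd.norm <- grid.HPD(dnorm, -5, 5)     # sanity check on a standard normal
```

For the standard normal the sanity check should land close to -1.96 and 1.96. Like the code above, the grid version normalizes by the mass inside [lwr, upr], so the limits need to be wide enough to capture essentially all of the density.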

For your second question (Cauchy), I decided to wrap my solution in a function. tl;dr: it works if you set the limits wide enough.

get.HPDinterval <- function(posterior,lwr=-2,upr=2,level=0.95,eps=0.001) {
  mode = optimize(posterior, interval = c(lwr, upr), maximum = TRUE, tol = 1e-12)[[1]]
  inverse.posterior <- function(x,side="left") {
    target <- function(y) posterior(y)-x
    ur <- switch(side,
                 left=try(uniroot(target,interval=c(lwr,mode))),
                 right=try(uniroot(target,interval=c(mode,upr))))
    if (inherits(ur,"try-error")) stop("inverse.posterior failed: extend limits?")
    return(ur$root)
  }
  areafun <- function(h) {
    i1 <- inverse.posterior(h,"left")
    i2 <- inverse.posterior(h,"right")
    return(integrate(posterior,i1,i2)$value)
  }
  post.area <- integrate(posterior,lwr,upr)$value
  if (post.area<level) stop("limits don't encompass desired area: a=",round(post.area,3))
  find.lims <- function(a) {
    ur <- uniroot(function(h) areafun(h)/post.area-a,
                  c(eps,posterior(mode)-eps))
    return(ur$root)
  }
  f <- find.lims(level)
  return(c(inverse.posterior(f,"left"),
           inverse.posterior(f,"right")))
}

get.HPDinterval(posterior)

posterior2 = function(x) dcauchy(x)
get.HPDinterval(posterior2,-10,10)  ## limits don't encompass desired area
get.HPDinterval(posterior2,-15,15)  ## inverse.posterior failed: extend limits?
get.HPDinterval(posterior2,-20,20)  ## -7.83993 7.83993
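
One caveat worth checking for heavy-tailed densities: get.HPDinterval divides by post.area, the area inside [lwr, upr], so the interval it returns covers 95% of the truncated mass rather than 95% of the full distribution. The sketch below illustrates this for the standard Cauchy using only base R's pcauchy and qcauchy. It relies on the assumption that, for a symmetric unimodal density, the highest-density interval coincides with the central interval.

```r
# Exact central 95% interval of the full standard Cauchy
full <- qcauchy(c(0.025, 0.975))

# Mass of the standard Cauchy that falls inside the limits [-20, 20]
mass20 <- pcauchy(20) - pcauchy(-20)

# Central interval that holds 95% of that truncated mass
truncated <- qcauchy(0.5 + c(-1, 1) * 0.95 * mass20 / 2)

full        # roughly -12.71 and 12.71
truncated   # roughly -7.89 and 7.89
```

The truncated interval is close to the limits reported for (-20, 20) above; the small remaining difference is plausibly due to uniroot's default tolerance. Widening lwr and upr moves the result toward the full-distribution interval, although eps may then need to be lowered so that the root search can still bracket the critical height.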