在 x 轴上找到与 cdf 的交点相对应的两个点

Question

对应的r code如下

theta <- seq(0,1, length = 10)
CD_theta <- function(x, p, n){
  1 - pbinom(x, size = n, prob = p) + 1 / 2 * dbinom(x, size = n, prob = p)
}

然后我把数据画成如下图：

mytheta <- CD_theta(5, theta, 10)
df <- data.frame(theta = theta, mytheta = mytheta)
ggplot(df, aes(x = theta, y = mytheta)) +
  geom_line(size = 1, col = "steelblue") +
  ylab("H(Theta)") +
  xlab("Theta")

对应的图表如下。

如您所见，有两条水平线（红色图形）和两条垂直线（黑色图形）。我需要找到 x 轴上对应于 H(theta) 交点的两个点。

我使用 r 中的 locator() 函数来计算单次迭代的两个 x 截距。我想重复以上1000次，所以单独找到它们真的很乏味。

有没有其他r函数可以用来找到这两个x截取点？

提前致谢。

Answer 1

稍微增加曲线的离散化，这变得相当简单：

theta <- seq(0,1, length = 100) # increase length here for more precision on point locations
CD_theta <- function(x, p, n){
  1 - pbinom(x, size = n, prob = p) + 1 / 2 * dbinom(x, size = n, prob = p)
}

mytheta <- CD_theta(5, theta, 10)
df <- data.frame(theta = theta, mytheta = mytheta)
ggplot(df, aes(x = theta, y = mytheta)) +
  geom_line(size = 1, col = "steelblue") +
  ylab("H(Theta)") +
  xlab("Theta")

points <- data.frame(x=c(theta[which.min(abs(mytheta - .975))], # find which point is the nearer
                         theta[which.min(abs(mytheta - .025))]),
                     y=c(.975,.025))

ggplot(df, aes(x = theta, y = mytheta)) +
  geom_line(size = 1, col = "steelblue") +
  ylab("H(Theta)") +
  xlab("Theta") + 
  geom_point(data=points,aes(y=y, x=x), size=5, col="red")

Answer 2

这是使用 optimize 函数的数值方法：

library(reprex)

theta <- seq(0,1, length = 10)
CD_theta <- function(x, p, n){
  1 - pbinom(x, size = n, prob = p) + 1 / 2 * dbinom(x, size = n, prob = p)
}

# Create a function to compute the difference between the "y" you want 
# and the "y" computed with CD_theta function 
# then try to minimize the output of this new function : 
# get the theta value corresponding to this "y"

my_fn <- function(theta_loc, y, x, n) { 
  # the function to optimize
  abs(y - CD_theta(x, theta_loc, n)) # dont forget abs (absolute)
}

# Then use optimize function to compute the theta value 
# between a given interval : c(0,1) in this case
# Note that you can directly modify here the values of y, x and n
v1 <- optimize(my_fn, c(0, 1), y = 0.025, x = 5, n = 10)$`minimum` 
v2 <- optimize(my_fn, c(0, 1), y = 0.975, x = 5, n = 10)$`minimum` 

# print the results
v1 # 0.025
#> [1] 0.2120079
v2 # 0.975
#> [1] 0.7879756

由 reprex package (v0.2.0) 创建于 2018-09-21。

Answer 3

如果你想找到精确的 Theta 和 HTheta 值，独立于网格大小（这里 N = 10），应用 uniroot 到 CD_theta函数。

CD_theta <- function(x, p, n) {
    1  -  pbinom (x, size = n, prob = p) + 
    1/2 * dbinom(x, size = n, prob = p)
}

u1 = uniroot(function(p) CD_theta(5, p, 10) - 0.025, c(0, 1))
u2 = uniroot(function(p) CD_theta(5, p, 10) - 0.975, c(0, 1))
(Theta1 = u1$root)  # 0.2119934
(Theta2 = u2$root)  # 0.7880066

但是如果离散化（N = 10）对你很重要，那么在网格点之间对该函数执行线性插值。

theta <- seq(0, 1, length = 10)
mytheta <- CD_theta(5, theta, 10)
f <- approxfun(theta, mytheta, method = "linear", 0.0, 1.0)

u1 = uniroot(function(p) f(p) - 0.025, c(0, 1))
u2 = uniroot(function(p) f(p) - 0.975, c(0, 1))
(Theta1 = u1$root)  # 0.2015638
(Theta2 = u2$root)  # 0.7984362

在 x 轴上找到与 cdf 的交点相对应的两个点

Find the two points on the x-axis corresponding to the intersections for a cdf

plot

r

binomial-cdf

ggplot2