R cdplot() - does the right axis show probability or density?

Reproducible data:

## NASA space shuttle o-ring failures
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1,
                 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## CD plot
cdplot(fail ~ temperature)

The documentation for cdplot says:

cdplot computes the conditional densities of x given the levels of y weighted by the marginal distribution of y. The densities are derived cumulatively over the levels of y. The conditional probabilities are not derived by discretization (as in the spinogram), but using a smoothing approach via density. The conditional density functions (cumulative over the levels of y) are returned invisibly.

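So on the plot, at x = 63, y is approximately 0.4. Is this a probability or a probability density? I am confused by the documentation about what is calculated, what is returned, and what is plotted.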

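The plot shows the probability of each outcome at a given temperature, so the right axis is a probability, not a density.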

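What the documentation is saying is that a standard density estimate is calculated for all the temperature measurements, and a separate density is calculated for the temperatures where fail is 'no'. If we divide the density of the 'no' temperatures by the density of all temperatures, and then weight the result by the proportion of 'no' observations, we get an estimate of the probability of 'no' at a given temperature.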
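The weighting described above is just Bayes' rule applied to densities. As a sketch, writing $x$ for temperature and $f$ for a density (this notation is introduced here and does not appear in the cdplot documentation):

```latex
P(\text{no} \mid x)
  = \frac{f(x \mid \text{no})\, P(\text{no})}{f(x)},
\qquad
f(x) = f(x \mid \text{no})\, P(\text{no}) + f(x \mid \text{yes})\, P(\text{yes})
```

The ratio of two densities is unitless, which is why the result can be read as a probability between 0 and 1 rather than as a density.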

To illustrate, let's look at the cdplot:

cdplot(fail ~ temperature)

Now let's calculate the probabilities manually from the densities and plot them. We should get a curve of near-identical shape:

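# Density of all temperatures, evaluated over the observed range
all <- density(temperature, from = min(temperature), to = max(temperature))

# Density of the temperatures where no failure occurred, on the same grid
no  <- density(temperature[fail == "no"], from = min(temperature),
               to = max(temperature))

# Estimated P(fail == "no" | temperature): density ratio weighted by the share of "no"
probs <- no$y/all$y * proportions(table(fail))[1]

# 1 - probs is the complementary estimate, P(fail == "yes" | temperature)
plot(all$x, 1 - probs, type = "l", ylim = c(0, 1))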
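To check the manual calculation against what cdplot itself returns, the two can be compared numerically. The following is a minimal sketch. It assumes that cdplot reuses the bandwidth of the overall density for the conditional density and evaluates both on the same grid (my reading of its behaviour, not something the quoted documentation states). The names cdens, p_no, grid, manual and from_cdplot are illustrative.

```r
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1,
                 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

# cdplot returns its (cumulative) conditional density functions invisibly;
# plot = FALSE skips the drawing and just returns them
cdens <- cdplot(fail ~ temperature, plot = FALSE)

# Overall density with default settings
all <- density(temperature)

# Density for "no", using the same bandwidth and grid (assumption, see above)
no <- density(temperature[fail == "no"], bw = all$bw,
              from = min(all$x), to = max(all$x))

# Estimated P(fail == "no" | temperature) on the density grid
p_no <- no$y / all$y * mean(fail == "no")

# Compare both estimates at whole-degree temperatures
grid <- 53:81
manual <- approx(all$x, p_no, xout = grid)$y
from_cdplot <- cdens[[1]](grid)

# Largest absolute disagreement between the two estimates
print(max(abs(manual - from_cdplot)))

# Estimated probability of "no" at the temperature asked about
print(cdens[[1]](63))
```

If the assumption holds, the printed difference should be negligible, and the value at 63 is the height of the boundary that is read off the right axis of the plot.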