R quantreg model does not reproduce quantiles: Why?

Question

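I am using the quantreg package to predict quantiles and their confidence intervals. I don't understand why the predicted quantile differs from the quantile computed directly from the data with quantile().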

library(tidyverse)
library(quantreg)

data <- tibble(data=runif(10)*10)
qr1 <- rq(formula=data ~ 1, tau=0.9, data=data) #  quantile regression
yqr1<- predict(qr1, newdata=tibble(data=c(1)), interval='confidence', level=0.95, se='boot') # predict quantile
q90 <- quantile(data$data, 0.9) # quantile of sample

> yqr1
       fit    lower   higher
1 6.999223 3.815588 10.18286
> q90
     90% 
7.270891

Answer 1

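Bear in mind that with only 10 observations, an estimate of the 90th percentile depends on the two largest values alone. Have a look at the help page for quantile (?quantile): it documents several definitions of a sample quantile, nine in total, selected with the type argument.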
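To see how much the definitions can disagree on a small sample, here is a minimal sketch that evaluates the 90th percentile under each of the nine `type` values of `quantile()`. The vector `1:10` is an illustrative stand-in, not the data from the question, and `q_by_type` is just a name chosen for this example.

```r
# Illustrative sample (an assumption for this sketch, not the question's data).
x <- 1:10

# Evaluate the 90th percentile under each of the nine definitions
# that quantile() exposes through its `type` argument.
q_by_type <- sapply(1:9, function(type) {
  quantile(x, probs = 0.9, type = type, names = FALSE)
})
names(q_by_type) <- paste0("type", 1:9)

print(q_by_type)
```

Type 1 always returns an observed value, while the default type 7 interpolates between neighbouring order statistics, so on ten points the two need not agree.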

When I run this, I see:

 yqr1<- predict(qr1, newdata=tibble(data=c(1)) ) 
 yqr1
       1 
8.525812

When I look at the data, I see:

data
# A tibble: 10 x 1
         data
        <dbl>
 1 8.52581158
 2 7.73959380
 3 4.53000680
 4 0.03431813
 5 2.13842058
 6 5.60713159
 7 6.17525537
 8 8.76262959
 9 5.30750304
10 4.61817190

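So rq estimates the 90th percentile as the second-highest value, which seems perfectly reasonable. quantile does not estimate it that way; by default (type 7) it interpolates between the two largest values: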

quantile(data$data, .9)
#     90% 
#8.549493 
?quantile
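The two numbers can be reconciled with base R alone. The sketch below copies the ten values printed above, computes the type 1 and type 7 quantiles, and evaluates the check (pinball) loss that quantile regression minimises at both. `check_loss` is a hypothetical helper written for this illustration, not a quantreg function. With n = 10 and tau = 0.9, n * tau is an integer, so the loss is flat between the two largest observations and both values are equally good minimisers; that rq reports the lower one is what the run above shows, not something this sketch asserts in general.

```r
# Data copied from the tibble printed in this answer.
x <- c(8.52581158, 7.73959380, 4.53000680, 0.03431813, 2.13842058,
       5.60713159, 6.17525537, 8.76262959, 5.30750304, 4.61817190)

# Type 1 inverts the empirical CDF, so it returns an observed value.
q_type1 <- quantile(x, 0.9, type = 1, names = FALSE)

# Type 7 (the default) interpolates between the 9th and 10th order statistics.
q_type7 <- quantile(x, 0.9, names = FALSE)

# Check loss for an intercept-only quantile model.
# Hypothetical helper for illustration; not part of quantreg.
check_loss <- function(q, y, tau) {
  u <- y - q
  sum(u * (tau - (u < 0)))
}

loss_type1 <- check_loss(q_type1, x, tau = 0.9)
loss_type7 <- check_loss(q_type7, x, tau = 0.9)

print(c(type1 = q_type1, type7 = q_type7))
print(c(loss_type1 = loss_type1, loss_type7 = loss_type7))
```

If the goal is simply to compare like with like, calling `quantile(data$data, 0.9, type = 1)` gives a value that is an actual observation, which is the kind of answer rq returned here.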

