Cannot use predFit to get confidence interval data

I am trying to compute confidence intervals from my nls model. I tried the same code as in this accepted answer:

But I get a strange error:

Error in eval(form[[3]]) : object 'a' not found
4. eval(form[[3]])
3. eval(form[[3]])
2. predFit.nls(gloss.nls, newdata = data.frame(stimulus = seq(0, 1, by = 0.1)), interval = "confidence", level = 0.9)
1. predFit(gloss.nls, newdata = data.frame(stimulus = seq(0, 1, by = 0.1)), interval = "confidence", level = 0.9)

I am using almost the same code as the answer above, only with different data:

gloss.nls <- nls(
                normP ~ a[1]*stimulus^3+a[2]*stimulus,
                data = data.mlds %>% filter(overall == TRUE),
                start = list(a=c(0.4,0.6))
                )

predFit(gloss.nls, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)

Here is my data:

id  rank  stimulus  pscale  normP  overall
0   1   0.000   0.0000000   0.00000000  TRUE
0   2   0.125   0.3151757   0.05889716  TRUE
0   3   0.250   0.9225827   0.17240385  TRUE
0   4   0.375   1.4164383   0.26469110  TRUE
0   5   0.500   1.7400011   0.32515557  TRUE
0   6   0.625   2.3531344   0.43973235  TRUE
0   7   0.750   3.1662257   0.59167546  TRUE
0   8   0.875   4.3538122   0.81360082  TRUE
0   9   1.000   5.3512879   1.00000000  TRUE
1   1   0.000   0.0000000   0.00000000  FALSE

Short answer

Try

a <- c(0.4,0.6)
predFit(gloss.nls, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)

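Long answer

First, note that your model is linear in its parameters, so you can estimate it with ordinary least squares (lm), where confidence intervals are straightforward. Be aware that lm adds an intercept by default, which your nls formula does not have, so the fit below differs slightly from the nls fit (for example, it is not exactly 0 at stimulus = 0); put 0 + in front of the right-hand side of the formula if you want to drop the intercept.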

library(tidyverse)
gloss.lm  <-  lm(normP ~ I(stimulus^3)+stimulus,
                data = data.mlds %>% filter(overall == TRUE)  )
predict(gloss.lm, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)
           fit         lwr        upr
1  0.005554547 -0.02791979 0.03902889
2  0.061136954  0.03572392 0.08654999
3  0.119410056  0.09931945 0.13950067
4  0.183064551  0.16435972 0.20176938
5  0.254791132  0.23459593 0.27498634
6  0.337280497  0.31518226 0.35937873
7  0.433223342  0.41047149 0.45597519
8  0.545310361  0.52354420 0.56707652
9  0.676232250  0.65548488 0.69697962
10 0.828679707  0.80399326 0.85336616
11 1.005343426  0.96738940 1.04329745
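To make the least-squares fit match the nls formula exactly, the intercept can be dropped. The following is a minimal sketch, not part of the original answer: the names d, new, gloss.lm0 and ci.lm0 are made up for illustration, and d simply re-enters the rows of the question's data with overall == TRUE.

```r
# Rows of the question's data with overall == TRUE (re-entered by hand)
d <- data.frame(
  stimulus = seq(0, 1, by = 0.125),
  normP = c(0, 0.05889716, 0.17240385, 0.2646911, 0.32515557,
            0.43973235, 0.59167546, 0.81360082, 1)
)

# "0 +" removes the intercept, so the model is a1*stimulus^3 + a2*stimulus,
# the same functional form as the nls formula
gloss.lm0 <- lm(normP ~ 0 + I(stimulus^3) + stimulus, data = d)

# 90% confidence intervals for the mean response on a grid of stimulus values
new <- data.frame(stimulus = seq(0, 1, by = 0.1))
ci.lm0 <- predict(gloss.lm0, newdata = new,
                  interval = "confidence", level = 0.9)
print(ci.lm0)
```

With no intercept the fitted curve passes through the origin, so the interval at stimulus = 0 collapses to a single point.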

If you insist on estimating the model with nonlinear least squares, then

gloss.nls <-  nls(normP ~ a[1]*stimulus^3+a[2]*stimulus,
                data = data.mlds %>% filter(overall == TRUE) ,
                start=list(a=c(.5, .5)) )

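Annoyingly, predict.nls currently ignores the interval argument, so the following call returns only the fitted values and no confidence intervals.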

predict(gloss.nls, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)
 [1] 0.00000000 0.05704647 0.11672200 0.18165566 0.25447650 0.33781360
 [7] 0.43429601 0.54655279 0.67721301 0.82890574 1.00426003
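Since predict.nls returns only fitted values, one way to get intervals with base R alone is the delta method: take the Jacobian of the mean function with respect to the parameters, combine it with vcov() of the fit, and use a t quantile. The sketch below is an illustration under stated assumptions, not the original answer's method. The Jacobian is written out by hand, which is only valid for this particular formula, and the names d, new, J, V, se.fit and ci.nls are made up.

```r
# Rows of the question's data with overall == TRUE (re-entered by hand)
d <- data.frame(
  stimulus = seq(0, 1, by = 0.125),
  normP = c(0, 0.05889716, 0.17240385, 0.2646911, 0.32515557,
            0.43973235, 0.59167546, 0.81360082, 1)
)

gloss.nls <- nls(normP ~ a[1]*stimulus^3 + a[2]*stimulus,
                 data = d, start = list(a = c(0.5, 0.5)))

new <- data.frame(stimulus = seq(0, 1, by = 0.1))
fit <- predict(gloss.nls, newdata = new)

# Jacobian of the mean function with respect to (a[1], a[2]) at the new points.
# Hand-derived for this formula only: d/da1 = stimulus^3, d/da2 = stimulus.
J <- cbind(new$stimulus^3, new$stimulus)

# Delta method: Var(fit_i) = J_i %*% V %*% t(J_i)
V <- vcov(gloss.nls)
se.fit <- sqrt(rowSums((J %*% V) * J))

# Two-sided 90% interval based on the residual degrees of freedom
tcrit <- qt(1 - (1 - 0.9) / 2, df.residual(gloss.nls))
ci.nls <- cbind(fit = fit, lwr = fit - tcrit * se.fit, upr = fit + tcrit * se.fit)
print(ci.nls)
```

Because this model is linear in its parameters, the result should agree with the intervals from a no-intercept lm fit of the same formula, which gives a convenient cross-check.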

Fortunately, investr::predFit has an implementation that computes confidence intervals.

library(investr)
predFit(gloss.nls, interval='prediction', newdata = data.frame(stimulus=seq(0, 1, by = 0.1), confidence=.9))

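...but this returns an error (the one you ran into in your question).

I have not dug deeply into the predFit.nls code, but predFit seems to silently evaluate parts of gloss.nls$call behind the scenes (the traceback shows it evaluating the right-hand side of the formula), and if it does not find everything it needs, it returns this strange error. Creating an object in the workspace with the same shape as a makes the error go away.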

a <- coef(gloss.nls)
investr::predFit(gloss.nls, interval='prediction', newdata = data.frame(stimulus=seq(0, 1, by = 0.1), confidence=.9))
             fit          lwr        upr
 [1,] 0.00000000 -0.050130071 0.05013007
 [2,] 0.05704647  0.006916398 0.10717654
 [3,] 0.11672200  0.066591929 0.16685207
 [4,] 0.18165566  0.131525585 0.23178573
 [5,] 0.25447650  0.204346430 0.30460657
 [6,] 0.33781360  0.287683526 0.38794367
 [7,] 0.43429601  0.384165935 0.48442608
 [8,] 0.54655279  0.496422720 0.59668286
 [9,] 0.67721301  0.627082944 0.72734309
[10,] 0.82890574  0.778775670 0.87903581
[11,] 1.00426003  0.954129960 1.05439010

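Interestingly, the values in a make no difference. Try, for example, a <- c(7500,-100) and you get the same result. Could this be a bug in investr? Two cautions about the output shown here: these calls ask for interval='prediction', and confidence=.9 sits inside data.frame() as a column of newdata rather than being passed as level, so what is printed are prediction intervals at the default level, not 90% confidence intervals. Also, every row has the same half-width (about 0.0501), even at stimulus = 0 where the fitted curve is pinned to zero, which suggests that the standard error of the fit is not entering the calculation. The workaround silences the error, but the resulting intervals should be treated with caution.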

a <- c(7500,-100)
predFit(gloss.nls, interval='prediction', newdata = data.frame(stimulus=seq(0, 1, by = 0.1), confidence=.9))
             fit          lwr        upr
 [1,] 0.00000000 -0.050130071 0.05013007
 [2,] 0.05704647  0.006916398 0.10717654
 [3,] 0.11672200  0.066591929 0.16685207
 [4,] 0.18165566  0.131525585 0.23178573
 [5,] 0.25447650  0.204346430 0.30460657
 [6,] 0.33781360  0.287683526 0.38794367
 [7,] 0.43429601  0.384165935 0.48442608
 [8,] 0.54655279  0.496422720 0.59668286
 [9,] 0.67721301  0.627082944 0.72734309
[10,] 0.82890574  0.778775670 0.87903581
[11,] 1.00426003  0.954129960 1.05439010
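One possible explanation, offered as an assumption rather than something verified against the investr source: predFit may look parameters up by names(coef(fit)), which are a1 and a2 for a vector parameter a, while the formula itself refers to a[1] and a[2]. If that is what happens, writing the parameters as separate scalars should avoid both the error and the dummy object. The sketch below refits the model that way; gloss.nls2, d, new and ci are illustrative names, and the predFit call is skipped when investr is not installed.

```r
# Rows of the question's data with overall == TRUE (re-entered by hand)
d <- data.frame(
  stimulus = seq(0, 1, by = 0.125),
  normP = c(0, 0.05889716, 0.17240385, 0.2646911, 0.32515557,
            0.43973235, 0.59167546, 0.81360082, 1)
)

# Same model, but with scalar parameters a1 and a2 instead of a[1] and a[2],
# so the names used in the formula match names(coef(gloss.nls2))
gloss.nls2 <- nls(normP ~ a1*stimulus^3 + a2*stimulus,
                  data = d, start = list(a1 = 0.5, a2 = 0.5))

new <- data.frame(stimulus = seq(0, 1, by = 0.1))

ci <- NULL
if (requireNamespace("investr", quietly = TRUE)) {
  # interval and level are arguments of predFit itself, not columns of newdata
  ci <- investr::predFit(gloss.nls2, newdata = new,
                         interval = "confidence", level = 0.9)
  print(ci)
}
```

If the assumption holds, the interval widths should now vary with stimulus instead of being constant.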

Data:

data.mlds <- structure(list(id = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L),
    rank = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 1L), stimulus = c(0,
    0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1, 0), pscale = c(0,
    0.3151757, 0.9225827, 1.4164383, 1.7400011, 2.3531344, 3.1662257,
    4.3538122, 5.3512879, 0), normP = c(0, 0.05889716, 0.17240385,
    0.2646911, 0.32515557, 0.43973235, 0.59167546, 0.81360082,
    1, 0), overall = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
    TRUE, TRUE, FALSE)), row.names = c(NA, -10L), class = "data.frame")