如何在 R 中为内部有些复杂的 nls 模型生成起始值?

How to generate starting values for a somewhat internally complex nls model in R?

我正在尝试使用在我的领域中有很多先例的 nls 构建一个有点复杂的模型,但我反复收到错误“评估模型时产生的缺失值或无穷大”。从以前的帖子中我知道这可能是由糟糕的起始值引起的,但我无法生成更高质量的起始值。我的模型采用一般结构:

流出量 = a * e(b * 温度) + ln(c + d*水分) .

我能够为参数 a 和 b 生成强大的起始值,因为加号左边的所有方程都可以拆分成自己的简单方程,两边都可以取对数,然后这些对数转换数据的线性模型可以很容易地从中提取它们的起始参数。但是,我在提取 c 和 d 的参数时遇到了麻烦,因为相同的过程对于完整的方程式几乎不起作用。这是我的代码,试图随意猜测一些 c 和 d 起始参数:


library(dplyr) 


model_efflux <- d_resp_data_2020 %>% 
  group_by(stand, plot) %>% 
  do(model = nls(avefflux ~ a * exp(b * avtemp_data_2020) + log(c + d*(moisture_data_2020)), start = list(a = 0.8, b = 0.1, c = -1, d = 1), data = .)) %>% ungroup()

这里有一些数据可以使用:

,timestamp,timestampfull,date,stand,plot,position,depth,moisture_data_2020,avefflux,avtemp_data_2020
1,9:30,2020-11-16 09:30:00,11/16/2020,ac1911,2,9,d,11.8,0.691,5.85
2,10:00,2020-11-16 10:00:00,11/16/2020,ac1911,2,15,d,14.5,0.455,6.11
3,10:00,2020-11-16 10:00:00,11/16/2020,ac1911,1,3,d,16.7,0.345,5.34
4,10:30,2020-11-16 10:30:00,11/16/2020,ac1911,1,9,d,14.5,1.11,6.44
5,10:30,2020-11-16 10:30:00,11/16/2020,ac1911,1,15,d,9.2,0.861,5.72
6,11:30,2020-11-16 11:30:00,11/16/2020,ac1911,1,3,d,11.4,0.236,5.61
7,12:00,2020-11-16 12:00:00,11/16/2020,ac1911,1,9,d,15.4,1.28,5.53
8,12:00,2020-11-16 12:00:00,11/16/2020,ac1911,1,15,d,12.9,0.651,5.88
9,12:00,2020-11-16 12:00:00,11/16/2020,ac1911,2,3,d,16.1,0.781,5.78
10,12:30,2020-11-16 12:30:00,11/16/2020,ac1911,2,9,d,12.4,0.839,5.71
11,13:00,2020-11-16 13:00:00,11/16/2020,ac1911,2,15,d,15.9,1.295,6.02
12,14:00,2020-11-16 14:00:00,11/16/2020,ac1911,7,3,d,10.1,0.831,5.46
13,14:30,2020-11-16 14:30:00,11/16/2020,ac1911,7,9,d,11.4,0.626,5.75
14,14:30,2020-11-16 14:30:00,11/16/2020,ac1911,7,15,d,14.2,0.686,5.22
15,15:30,2020-11-16 15:30:00,11/16/2020,ac1911,8,3,d,16.7,0.611,5.77
16,15:30,2020-11-16 15:30:00,11/16/2020,ac1911,8,9,d,12.1,0.954,5.11
17,16:00,2020-11-16 16:00:00,11/16/2020,ac1911,8,15,d,7.6,0.709,5.91
18,16:30,2020-11-16 16:30:00,11/16/2020,ac1911,1,3,d,15.9,0.614,6.06
19,16:30,2020-11-16 16:30:00,11/16/2020,ac1911,1,9,d,13.6,0.971,5.96
20,16:30,2020-11-16 16:30:00,11/16/2020,ac1911,1,15,d,15.2,1.235,5.84
21,11:30,2020-11-17 11:30:00,11/17/2020,ac1911,104,3,d,11.8,0.544,4.7
22,11:30,2020-11-17 11:30:00,11/17/2020,ac1911,104,9,d,11.4,0.228,3.68
23,12:00,2020-11-17 12:00:00,11/17/2020,ac1911,104,15,d,10.5,0.657,4.38
24,12:30,2020-11-17 12:30:00,11/17/2020,ac1911,103,3,d,8.8,0.581,3.99
25,12:30,2020-11-17 12:30:00,11/17/2020,ac1911,103,9,d,11.4,0.518,3.99
26,13:00,2020-11-17 13:00:00,11/17/2020,ac1911,103,15,d,10.9,0.418,3.53
27,15:30,2020-11-17 15:30:00,11/17/2020,ac1911,2,3,d,17.2,0.709,3.3
28,15:30,2020-11-17 15:30:00,11/17/2020,ac1911,2,9,d,10.7,0.568,3.47
29,16:00,2020-11-17 16:00:00,11/17/2020,ac1911,2,15,d,9.5,0.949,3.77
30,16:00,2020-11-17 16:00:00,11/17/2020,ac1911,3,3,d,11.9,0.76,4.92
31,16:00,2020-11-17 16:00:00,11/17/2020,ac1911,3,9,d,12.9,0.794,5.17
32,16:30,2020-11-17 16:30:00,11/17/2020,ac1911,3,15,d,8.9,0.602,4.3
33,16:30,2020-11-17 16:30:00,11/17/2020,ac1911,1,3,d,9.6,0.762,3.51
34,16:30,2020-11-17 16:30:00,11/17/2020,ac1911,1,9,d,11.3,0.435,3.16
35,17:00,2020-11-17 17:00:00,11/17/2020,ac1911,1,15,d,11.2,0.631,3.79
36,16:00,2020-11-18 16:00:00,11/18/2020,ac1911,e,3,d,11.2,1.33,4.45
37,16:00,2020-11-18 16:00:00,11/18/2020,ac1911,e,9,d,14.1,2.135,5.21
38,16:00,2020-11-18 16:00:00,11/18/2020,ac1911,e,15,d,10.9,0.949,4.48
39,16:30,2020-11-18 16:30:00,11/18/2020,ac1911,x,3,d,16.9,2.565,5.08
40,16:30,2020-11-18 16:30:00,11/18/2020,ac1911,x,9,d,11.6,0.891,4.6
41,17:00,2020-11-18 17:00:00,11/18/2020,ac1911,x,15,d,10.9,1.145,4.71
42,17:00,2020-11-18 17:00:00,11/18/2020,ac1911,f,3,d,14.4,0.778,5.02
43,17:00,2020-11-18 17:00:00,11/18/2020,ac1911,f,9,d,10.1,1.165,5.87
44,17:30,2020-11-18 17:30:00,11/18/2020,ac1911,f,15,d,10.7,0.726,4.52
45,9:00,2020-11-19 09:00:00,11/19/2020,ac1911,3,3,d,10.6,0.292,2.53
46,9:30,2020-11-19 09:30:00,11/19/2020,ac1911,3,9,d,9.3,0.613,3.46
47,9:30,2020-11-19 09:30:00,11/19/2020,ac1911,3,15,d,9.9,0.438,2.43
48,9:30,2020-11-19 09:30:00,11/19/2020,ac1911,2,3,d,12.4,0.502,3.4
49,10:00,2020-11-19 10:00:00,11/19/2020,ac1911,2,9,d,10.6,0.33,3.08
50,10:00,2020-11-19 10:00:00,11/19/2020,ac1911,2,15,d,9.9,0.326,2.89
51,10:00,2020-11-19 10:00:00,11/19/2020,ac1911,1,3,d,8.8,0.645,3.78
52,10:30,2020-11-19 10:30:00,11/19/2020,ac1911,1,15,d,9,0.609,3.5
53,11:00,2020-11-19 11:00:00,11/19/2020,ac1911,2,3,d,8.8,0.74,4.14
54,11:00,2020-11-19 11:00:00,11/19/2020,ac1911,2,9,d,10.3,0.782,4.58
55,11:00,2020-11-19 11:00:00,11/19/2020,ac1911,2,15,d,10.5,0.735,4.49
56,11:30,2020-11-19 11:30:00,11/19/2020,ac1911,1,3,d,11.9,0.869,4.6
57,11:30,2020-11-19 11:30:00,11/19/2020,ac1911,1,15,d,10.4,1.024,5.09
58,12:00,2020-11-19 12:00:00,11/19/2020,ac1911,3,3,d,11.2,1.205,5.13
59,12:00,2020-11-19 12:00:00,11/19/2020,ac1911,3,9,d,12.6,1.048,4.7
60,12:30,2020-11-19 12:30:00,11/19/2020,ac1911,3,15,d,16.2,0.663,5.21
61,12:30,2020-11-19 12:30:00,11/19/2020,ac1911,1,3,d,14.8,0.928,5.2
62,13:00,2020-11-19 13:00:00,11/19/2020,ac1911,1,9,d,11.9,0.923,5.69
63,13:00,2020-11-19 13:00:00,11/19/2020,bf52,1,15,d,12.9,1.62,5.52
64,13:00,2020-11-19 13:00:00,11/19/2020,bf52,2,3,d,10.8,0.582,5.61
65,13:00,2020-11-19 13:00:00,11/19/2020,bf52,2,9,d,7.8,1.255,5.63
66,13:30,2020-11-19 13:30:00,11/19/2020,bf52,2,15,d,8.3,1.11,5.76
67,13:30,2020-11-19 13:30:00,11/19/2020,bf52,3,3,d,9.7,0.821,5.99
68,13:30,2020-11-19 13:30:00,11/19/2020,bf52,3,9,d,8.8,1.35,5.68
69,15:00,2020-11-19 15:00:00,11/19/2020,bf52,3,15,d,10.3,1.215,5.71
70,15:00,2020-11-19 15:00:00,11/19/2020,bf52,1,3,d,11.8,0.703,6.21
71,15:00,2020-11-19 15:00:00,11/19/2020,bf52,1,9,d,8.6,0.821,6.68
72,15:30,2020-11-19 15:30:00,11/19/2020,bf52,1,15,d,7.3,0.629,6.64
73,15:30,2020-11-19 15:30:00,11/19/2020,bf52,2,3,d,10.5,0.45,5.12
74,15:30,2020-11-19 15:30:00,11/19/2020,bf52,2,9,d,11.2,0.788,5.34
75,16:00,2020-11-19 16:00:00,11/19/2020,bf52,2,15,d,12.9,1.135,5.6
76,16:00,2020-11-19 16:00:00,11/19/2020,bf52,107,3,d,7.7,0.884,6.69
77,16:00,2020-11-19 16:00:00,11/19/2020,bf52,107,9,d,10.5,1.095,6.1
78,16:30,2020-11-19 16:30:00,11/19/2020,bf52,107,15,d,10.5,1.195,6.16
79,10:00,2020-11-20 10:00:00,11/20/2020,bf52,1,3,d,12.9,1.59,6.36
80,10:00,2020-11-20 10:00:00,11/20/2020,bf52,1,9,d,13.2,1.73,6.61
81,10:00,2020-11-20 10:00:00,11/20/2020,bf52,1,15,d,11,2.37,6.46
82,10:30,2020-11-20 10:30:00,11/20/2020,bf52,2,3,d,9.8,1.51,6.76
83,10:30,2020-11-20 10:30:00,11/20/2020,bf52,2,9,d,NA,NA,NA
84,11:00,2020-11-20 11:00:00,11/20/2020,bf52,2,15,d,14.4,0.778,6.53
85,11:30,2020-11-20 11:30:00,11/20/2020,bf52,3,3,d,9.8,0.987,6.02
86,11:30,2020-11-20 11:30:00,11/20/2020,bf52,3,9,d,10.1,1.775,6.54
87,11:30,2020-11-20 11:30:00,11/20/2020,bf52,3,15,d,NA,NA,NA
88,12:30,2020-11-20 12:30:00,11/20/2020,bf52,a,3,d,12.3,0.802,7.45
89,12:30,2020-11-20 12:30:00,11/20/2020,bf52,a,9,d,12.4,1.27,7.36
90,12:30,2020-11-20 12:30:00,11/20/2020,bf52,a,15,d,7.4,1.24,7.46
91,13:00,2020-11-20 13:00:00,11/20/2020,bf52,c,3,d,13.4,1.25,7.75
92,13:00,2020-11-20 13:00:00,11/20/2020,bf52,c,9,d,7.9,1.375,7.86
93,13:00,2020-11-20 13:00:00,11/20/2020,bf52,c,15,d,10.1,0.976,7.3
94,13:30,2020-11-20 13:30:00,11/20/2020,bf52,b,3,d,10.2,1.255,7.28
95,13:30,2020-11-20 13:30:00,11/20/2020,bf52,b,9,d,10.7,1.4,7.26
96,13:30,2020-11-20 13:30:00,11/20/2020,bf52,b,15,d,8.4,1.55,7.5
97,10:10,2020-07-28 10:00:00,20200728,bf52,1,3,d,6.3,3.335,21.33
98,10:24,2020-07-28 10:30:00,20200728,bf52,1,9,d,5.8,6.325,20.82
99,10:31,2020-07-28 10:30:00,20200728,bf52,1,15,d,4.9,3.445,21.23
100,10:42,2020-07-28 10:30:00,20200728,bf52,2,3,d,9.3,2.12,22.55
101,10:52,2020-07-28 11:00:00,20200728,bf52,2,9,d,6.4,4.155,22.52
102,10:59,2020-07-28 11:00:00,20200728,bf52,2,15,d,11.2,6.135,22.08
103,11:16,2020-07-28 11:30:00,20200728,bf52,1,3,d,10.3,4.965,21.89
104,11:57,2020-07-28 12:00:00,20200728,bf52,1,15,d,7.7,4.52,20.99

任何人都可以告诉我为什么会出现此错误,以及获得一些起始值的更好方法吗?

在 minpack.lm 中使用 nlsLM 我们得到以下结果。请注意,虽然此函数比 nls 更有可能提供答案,但它通常以过早收敛为代价。

注意我们在公式中取反了b

下面,我们首先拟合组合模型,然后使用其结果作为起始值来拟合各个组。

我们添加了一个 n 列来显示每组中的行数。

library(dplyr)
library(minpack.lm)

fo <- avefflux ~ a * exp(-b * moisture_data_2020) + log(c + d*avtemp_data_2020)
st <- list(a = 0.8, b = 0.1, c = 0.1, d = 1)
fm <- nlsLM(fo, d_resp_data_2020, start = st)

model_efflux <- d_resp_data_2020 %>% 
  group_by(stand, plot) %>% 
  mutate(n = n()) %>%
  group_by(n, .add = TRUE) %>%
  do(model = nlsLM(fo, data = ., start = coef(fm))) %>%
  ungroup %>%
  rowwise %>%
  mutate(a = coef(model)[1], b = coef(model)[2], c = coef(model)[3],
     d = coef(model)[4]) %>%
  ungroup

给出以下内容。它似乎在很多情况下都使用了整体拟合,而在其他情况下,拟合似乎非常可疑。

> model_efflux
# A tibble: 17 x 8
   stand  plot      n model             a     b       c     d
   <chr>  <chr> <int> <list>        <dbl> <dbl>   <dbl> <dbl>
 1 ac1911 1        18 <nls>         311.  1.02    0.977 0.241
 2 ac1911 103       3 <nls>          30.2 0.563  -0.930 0.677
 3 ac1911 104       3 <nls>          30.2 0.563  -0.930 0.677
 4 ac1911 2        14 <nls>        2642.  1.09    0.838 0.251
 5 ac1911 3         9 <nls>   -38121775.  2.19    0.570 0.371
 6 ac1911 7         3 <nls>          30.2 0.563  -0.930 0.677
 7 ac1911 8         3 <nls>          30.2 0.563  -0.930 0.677
 8 ac1911 e         3 <nls>          30.2 0.563  -0.930 0.677
 9 ac1911 f         3 <nls>          30.2 0.563  -0.930 0.677
10 ac1911 x         3 <nls>          30.2 0.563  -0.930 0.677
11 bf52   1        12 <nls>  -239464524.  4.02  -21.6   4.29 
12 bf52   107       3 <nls>          30.2 0.563  -0.930 0.677
13 bf52   2        12 <nls>       24349.  1.73  -11.3   2.48 
14 bf52   3         6 <nls>   196010890.  2.28   -9.55  2.12 
15 bf52   a         3 <nls>          30.2 0.563  -0.930 0.677
16 bf52   b         3 <nls>          30.2 0.563  -0.930 0.677
17 bf52   c         3 <nls>          30.2 0.563  -0.930 0.677

整体合身度为:

> fm
Nonlinear regression model
  model: avefflux ~ a * exp(-b * moisture_data_2020) + log(c + d * avtemp_data_2020)
   data: d_resp_data_2020
      a       b       c       d 
30.1573  0.5629 -0.9301  0.6775 
 residual sum-of-squares: 45.99

Number of iterations to convergence: 15 
Achieved convergence tolerance: 1.49e-08

nls

我们使用 nls 重做这个。因为上面的 a 值变化很大,我们将 a 值固定为 a 的整体拟合值,然后分别拟合其他值。正如预期的那样,nls 对于小组来说失败了,否则它会给出一个可能无效的答案。

fo <- avefflux ~ a * exp(-b * moisture_data_2020) + log(c + d*avtemp_data_2020)
st <- list(a = 0.8, b = 0.1, c = 0.1, d = 1)
fm <- nls(fo, d_resp_data_2020, start = st)

a <- coef(fm)[1]
model_efflux <- d_resp_data_2020 %>% 
  group_by(stand, plot) %>% 
  mutate(n = n()) %>%
  group_by(n, .add = TRUE) %>%
  do(model = try(nls(fo, data = ., start = coef(fm)[-1]))) %>%
  ungroup %>%
  rowwise %>%
  mutate(a = if (inherits(model, "nls")) a else NA,
         b = if (inherits(model, "nls")) coef(model)[["b"]] else NA,
         c = if (inherits(model, "nls")) coef(model)[["c"]] else NA,
         d = if (inherits(model, "nls")) coef(model)[["d"]] else NA) %>%
  ungroup

> model_efflux
# A tibble: 17 x 8
   stand  plot      n model              a      b       c      d
   <chr>  <chr> <int> <list>         <dbl>  <dbl>   <dbl>  <dbl>
 1 ac1911 1        18 <nls>           30.2  0.774   0.984  0.239
 2 ac1911 103       3 <try-errr [1]>  NA   NA      NA     NA    
 3 ac1911 104       3 <try-errr [1]>  NA   NA      NA     NA    
 4 ac1911 2        14 <nls>           30.2  0.617   0.822  0.252
 5 ac1911 3         9 <try-errr [1]>  NA   NA      NA     NA    
 6 ac1911 7         3 <try-errr [1]>  NA   NA      NA     NA    
 7 ac1911 8         3 <try-errr [1]>  NA   NA      NA     NA    
 8 ac1911 e         3 <try-errr [1]>  NA   NA      NA     NA    
 9 ac1911 f         3 <try-errr [1]>  NA   NA      NA     NA    
10 ac1911 x         3 <try-errr [1]>  NA   NA      NA     NA    
11 bf52   1        12 <try-errr [1]>  NA   NA      NA     NA    
12 bf52   107       3 <try-errr [1]>  NA   NA      NA     NA    
13 bf52   2        12 <nls>           30.2  0.768 -11.6    2.54 
14 bf52   3         6 <nls>           30.2  0.524  -6.22   1.52 
15 bf52   a         3 <try-errr [1]>  NA   NA      NA     NA    
16 bf52   b         3 <try-errr [1]>  NA   NA      NA     NA    
17 bf52   c         3 <try-errr [1]>  NA   NA      NA     NA    

nls 2

这与上一节相同,只是我们将 a 和 b 都固定为整体拟合给出的值。这次所有小组都给出了结果。

fo <- avefflux ~ a * exp(-b * moisture_data_2020) + log(c + d*avtemp_data_2020)
st <- list(a = 0.8, b = 0.1, c = 0.1, d = 1)
fm <- nls(fo, d_resp_data_2020, start = st)

a <- coef(fm)[1]
b <- coef(fm)[2]

model_efflux <- d_resp_data_2020 %>% 
  group_by(stand, plot) %>% 
  mutate(n = n()) %>%
  group_by(n, .add = TRUE) %>%
  do(model = try(nls(fo, data = ., start = coef(fm)[3:4]))) %>%
  ungroup %>%
  rowwise %>%
  mutate(a = if (inherits(model, "nls")) a else NA,
         b = if (inherits(model, "nls")) b else NA,
         c = if (inherits(model, "nls")) coef(model)[["c"]] else NA,
         d = if (inherits(model, "nls")) coef(model)[["d"]] else NA) %>%
  ungroup

> model_efflux
# A tibble: 17 x 8
   stand  plot      n model      a     b       c       d
   <chr>  <chr> <int> <list> <dbl> <dbl>   <dbl>   <dbl>
 1 ac1911 1        18 <nls>   30.2 0.563   0.620   0.291
 2 ac1911 103       3 <nls>   30.2 0.563   0.681   0.210
 3 ac1911 104       3 <nls>   30.2 0.563  -0.803   0.551
 4 ac1911 2        14 <nls>   30.2 0.563   0.727   0.262
 5 ac1911 3         9 <nls>   30.2 0.563   0.330   0.382
 6 ac1911 7         3 <nls>   30.2 0.563   4.10   -0.396
 7 ac1911 8         3 <nls>   30.2 0.563   9.87   -1.43 
 8 ac1911 e         3 <nls>   30.2 0.563 -28.0     6.95 
 9 ac1911 f         3 <nls>   30.2 0.563  -1.33    0.711
10 ac1911 x         3 <nls>   30.2 0.563 -80.8    18.0  
11 bf52   1        12 <nls>   30.2 0.563  -8.21    1.90 
12 bf52   107       3 <nls>   30.2 0.563  16.6    -2.24 
13 bf52   2        12 <nls>   30.2 0.563  -9.17    2.04 
14 bf52   3         6 <nls>   30.2 0.563  -6.24    1.55 
15 bf52   a         3 <nls>   30.2 0.563 102.    -13.4  
16 bf52   b         3 <nls>   30.2 0.563   0.208   0.450
17 bf52   c         3 <nls>   30.2 0.563  -6.31    1.20