nls failing on some subsets of data, but not on other, similar, subsets

Question

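I am trying to apply nls to my data by year, so there is a separate nls fit for each year. All of the years look broadly similar (exponential decay), but for some years nls() fails with a "singular gradient" error.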

Data that works:

good_data = data.frame(y = c(8.46,6.87,5.81,6.62,5.85,5.79,4.83,4.94,4.95,5.27,5.05,5.38,5.08,3.98),
                       x = c(2,6,6,7,7,8,9,10,12,13,14,15,16,17))

Data that fails:

bad_data = data.frame(y = c(8.99,5.86,5.32,5.74,5.41,5.04,4.66,4.52,4.18,4.66,5.38,5.46,5.21,5.37,4.89),
                      x = c(2,6,6,7,7,8,9,10,11,12,13,14,15,16,17))

The nls call I tried (shown here with good_data; the same call with data = bad_data is the one that fails):

fit = nls(y ~ SSasymp(x, Asym, R0, lrc), data = good_data)

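To my eye the two data sets look very similar. Is there any way to diagnose why one fails and the other does not? And is there anything I can do to fix it?

Thanks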

Answer 1

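Approaches (1) and (2) below are two ways to get a fit for the failing data set. If you want to automate this, you could try a direct fit first, then try (2) if that fails, and then (1) if that also fails. If all of them fail, the data probably does not really follow the model and should not be fitted to it.

Another possibility, which may avoid having to try different approaches in turn when the data sets are all similar enough, is to fit all of the data first and then fit each data set using the coefficients from that pooled fit as starting values. See (3).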

1) If more points are first added by spline interpolation, the fit converges:

sp <- with(bad_data, spline(x, y))
fit2sp <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = sp)
fit2sp

giving:

Nonlinear regression model
  model: y ~ SSasymp(x, Asym, R0, lrc)
   data: sp
   Asym      R0     lrc 
 5.0101 22.1915 -0.2958 
 residual sum-of-squares: 5.365

Number of iterations to convergence: 0 
Achieved convergence tolerance: 1.442e-06

2) If the data sets are similar, another approach is to use the coefficients of a previous successful fit as starting values:

fit1 <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = good_data)
fit2 <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = bad_data, start = coef(fit1))
fit2

giving:

Nonlinear regression model
  model: y ~ SSasymp(x, Asym, R0, lrc)
   data: bad_data
   Asym      R0     lrc 
 4.9379 15.5472 -0.7369 
 residual sum-of-squares: 2.245

Number of iterations to convergence: 10 
Achieved convergence tolerance: 7.456e-06

Below we plot both solutions:

plot(y ~ x, bad_data)
points(y ~ x, sp, pch = 20)
lines(fitted(fit2sp) ~ x, sp, col = "red")
lines(fitted(fit2) ~ x, bad_data, col = "blue", lty = 2)
legend("topright", c("data", "spline", "fit2sp", "fit2"), 
  pch = c(1, 20, NA, NA), lty = c(NA, NA, 1, 2), 
  col = c("black", "black", "red", "blue"))
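The fallback order suggested at the top of this answer (direct fit, then (2), then (1)) could be automated roughly as follows. This is only a sketch: the helper names attempt and fit_with_fallback, and the method labels they return, are made up for illustration and are not part of nls or any package.

```r
good_data <- data.frame(
  y = c(8.46, 6.87, 5.81, 6.62, 5.85, 5.79, 4.83, 4.94, 4.95, 5.27, 5.05, 5.38, 5.08, 3.98),
  x = c(2, 6, 6, 7, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17))
bad_data <- data.frame(
  y = c(8.99, 5.86, 5.32, 5.74, 5.41, 5.04, 4.66, 4.52, 4.18, 4.66, 5.38, 5.46, 5.21, 5.37, 4.89),
  x = c(2, 6, 6, 7, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17))

model <- y ~ SSasymp(x, Asym, R0, lrc)

# Hypothetical helper: evaluate an expression and return NULL if it
# signals an error (for example "singular gradient").
attempt <- function(expr) tryCatch(expr, error = function(e) NULL)

# Hypothetical helper: try the strategies in order and report which one worked.
fit_with_fallback <- function(d, start = NULL) {
  # Direct fit, relying on the self-starting model for initial values.
  fit <- attempt(nls(model, data = d))
  if (!is.null(fit)) return(list(fit = fit, method = "direct"))

  # Approach (2): starting values borrowed from a previous successful fit.
  if (!is.null(start)) {
    fit <- attempt(nls(model, data = d, start = start))
    if (!is.null(fit)) return(list(fit = fit, method = "start values"))
  }

  # Approach (1): fit to spline-interpolated points instead of the raw data.
  sp <- as.data.frame(spline(d$x, d$y))
  fit <- attempt(nls(model, data = sp))
  if (!is.null(fit)) return(list(fit = fit, method = "spline"))

  # Nothing converged: the data may simply not follow the model.
  list(fit = NULL, method = "none")
}

res_good <- fit_with_fallback(good_data)
res_bad  <- fit_with_fallback(bad_data, start = coef(res_good$fit))
res_bad$method
```

Returning the method label alongside the fit makes it easy to see afterwards which data sets needed help, which is worth checking before trusting the coefficients.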

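3) If all of the data sets are similar enough, another approach that may work is to fit all of the data together, and then fit each individual data set using the coefficients of that pooled fit as starting values: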

all_data <- rbind(good_data, bad_data)
fitall <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = all_data)
fit1a <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = good_data, start = coef(fitall))
fit2a <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = bad_data, start = coef(fitall))
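With one data set per year, approach (3) might be applied to every group in a single pass, along these lines. This is a sketch under assumptions: the grouping column grp is hypothetical (it stands in for the year variable, which the question does not show), and the two example data sets are used as the only groups.

```r
good_data <- data.frame(
  y = c(8.46, 6.87, 5.81, 6.62, 5.85, 5.79, 4.83, 4.94, 4.95, 5.27, 5.05, 5.38, 5.08, 3.98),
  x = c(2, 6, 6, 7, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17))
bad_data <- data.frame(
  y = c(8.99, 5.86, 5.32, 5.74, 5.41, 5.04, 4.66, 4.52, 4.18, 4.66, 5.38, 5.46, 5.21, 5.37, 4.89),
  x = c(2, 6, 6, 7, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17))

# Stack the data sets with a hypothetical grouping column (a stand-in for year).
all_data <- rbind(cbind(good_data, grp = "good"),
                  cbind(bad_data,  grp = "bad"))

model <- y ~ SSasymp(x, Asym, R0, lrc)

# Pooled fit over every group; its coefficients become the starting values.
fitall <- nls(model, data = all_data)

# Fit each group separately, keeping NULL for any group that still fails.
fits <- lapply(split(all_data, all_data$grp), function(d) {
  tryCatch(nls(model, data = d, start = coef(fitall)),
           error = function(e) NULL)
})

# One row of coefficients per group that converged.
converged <- Filter(Negate(is.null), fits)
do.call(rbind, lapply(converged, coef))
```

Groups left as NULL in fits are the ones that still need approach (1) or (2), or that may not follow the model at all.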
