R: fitting a sigmoidal curve?

Question

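I am doing some analysis on my electricity bills and trying to extract the usage of the A/C unit. By assuming that "normal" electricity use is fairly constant throughout the year, and that the A/C only runs in the hotter months, I was able to estimate a rough separation of the two.

The figure has two lines:

a linear one showing "normal use", cumulative
a sigmoidal curve representing the A/C use, cumulative

When I try to fit a good model through this total cumulative usage, the nls function in R gives the dreaded "singular gradient matrix at initial parameter estimates" error. Using different software (JMP) I can get good initial parameters, which is reflected in the purple fitted curve.

Any idea how to get the parameter estimates in R without the error?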

The nls model:

nls(elec_use_mean ~ b0 + b1 * month + a1 / (a2 + a3 * exp(a4 * month)),
    start = list(b0 = 100, b1 = 310, a1 = 1300, a2 = 0.3, a3 = 600, a4 = -1.2),
    data = cumulative
    )
#> Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff, nDcentral = nDcntr) : 
#>   singular gradient matrix at initial parameter estimates

The data:

month,elec_use_mean
1,461.46
2,839.46
3,1197.92
4,1553.59
5,2093.34
6,3096.42
7,4353.67
8,5652.51
9,6729.84
10,7296.92
11,7634.34
12,8071.84
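For anyone who wants to reproduce this, here is a minimal sketch of loading the data above into a data frame. The name `cumulative` is taken from the `data =` argument of the nls call; `monthly_use` is a hypothetical helper name that is not part of the original analysis.

```r
# Rebuild the data frame that the nls() call refers to as `cumulative`.
cumulative <- data.frame(
  month = 1:12,
  elec_use_mean = c(461.46, 839.46, 1197.92, 1553.59, 2093.34, 3096.42,
                    4353.67, 5652.51, 6729.84, 7296.92, 7634.34, 8071.84)
)

# Month-to-month increments recover the non-cumulative monthly usage
# (assuming the running total starts from zero before month 1).
monthly_use <- diff(c(0, cumulative$elec_use_mean))
```

Looking at `monthly_use` is a quick way to check that the summer months really do stand out before attempting any fit.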

The plot is below.

Answer 1

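Another way to parameterise a sigmoidal curve is to use tanh, which gives a nicely converging model with only 4 parameters instead of 6. The formula can also be written so that each parameter has a well-defined meaning, which helps when choosing the initial estimates: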

elec_use_mean ~ b1 * month + 6 * b0 * (1 + tanh(a1 * (month - a0)))

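where:

b0 is the mean monthly air-conditioning usage over the year
b1 is the mean monthly background electricity usage across the year
a0 is the month of the year with peak air-conditioning usage
a1 is a measure of how "seasonal" the air-conditioning usage is, where 0 is not seasonal at all (1/12 of the total A/C usage in every month) and 1 means that about 76% of the A/C usage falls in the two hottest months.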
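The "about 76%" figure can be checked directly from the formula. The cumulative A/C term 6 * b0 * (1 + tanh(a1 * (month - a0))) rises from 0 to 12 * b0 in total, so the share of annual A/C use within w months either side of a0 works out to tanh(a1 * w). A small sketch of that calculation, where `ac_share` is a hypothetical helper name used only for illustration:

```r
# Share of annual A/C use within `half_width` months either side of the
# peak a0, for the cumulative term 6 * b0 * (1 + tanh(a1 * (month - a0))).
ac_share <- function(a1, half_width = 1) {
  # (F(a0 + w) - F(a0 - w)) / (12 * b0) simplifies to tanh(a1 * w),
  # so the result does not depend on b0 or a0.
  tanh(a1 * half_width)
}

ac_share(1)    # roughly 0.76: the two months around the peak
ac_share(0.5)  # a flatter, less seasonal profile
```

This also explains the factor of 6: tanh spans a range of 2, so 6 * b0 * 2 = 12 * b0 is the annual A/C total, which makes b0 a monthly mean.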

model <- nls(elec_use_mean ~ b1 * month + 6 * b0 * (1 + tanh(a1 * (month - a0))),
    start = list(b0 = 330, b1 = 300, a0 = 7, a1 = 0.5),
    data = cumulative
    )

summary(model)
#> 
#> Formula: elec_use_mean ~ b1 * month + 6 * b0 * (1 + tanh(a1 * (month - 
#>     a0)))
#> 
#> Parameters:
#>     Estimate Std. Error t value Pr(>|t|)    
#> b0 294.99540   16.88446   17.47 1.18e-07 ***
#> b1 379.93397   16.34690   23.24 1.25e-08 ***
#> a0   7.07014    0.05877  120.31 2.55e-14 ***
#> a1   0.61313    0.05616   10.92 4.39e-06 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 70.82 on 8 degrees of freedom
#> 
#> Number of iterations to convergence: 8 
#> Achieved convergence tolerance: 2.64e-06

By generating a smooth set of predictions, we can see that this fits the data very well:

library(ggplot2)

new_df <- data.frame(month = seq(1, 12, 0.1))
new_df$elec_use_mean <- predict(model, new_df)

ggplot(cumulative, aes(month, elec_use_mean)) +
  geom_point() +
  geom_line(data = new_df, linetype = 2) +
  scale_x_continuous(breaks = 1:12, labels = month.abb)

Created on 2022-04-21 by the reprex package (v2.0.1)
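Since the original goal was to separate out the A/C usage, the fitted parameters can be turned into a per-month estimate. This is only a sketch: it plugs in the point estimates from the summary output, assumes that the cumulative value at month m is the total through the end of that month (so month 0 is the start of the year), and uses the hypothetical names `ac_cumulative` and `ac_monthly`.

```r
# Point estimates copied from the summary() output of the fitted model.
b0 <- 294.99540
a0 <- 7.07014
a1 <- 0.61313

# Cumulative A/C component of the fitted model at a given month.
ac_cumulative <- function(month) 6 * b0 * (1 + tanh(a1 * (month - a0)))

# Estimated A/C use within each month: the increase in the cumulative
# curve since the end of the previous month (assumption: month 0 is the
# start of the year).
ac_monthly <- diff(ac_cumulative(0:12))
names(ac_monthly) <- month.abb

round(ac_monthly)
```

The background usage is then simply b1 per month, so the two components add back up to the fitted cumulative curve.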

Answer 2

Thanks, everyone. With Allan's help, I got a good answer.

For details, see https://github.com/robhanssen/utility/blob/main/analysis/cumulative_model%20alt.r

The final output:
