How to find a curve that fits a series of points in R?

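I need to find the equation of the power curve that fits the daily number of people contaminated by a disease, so that I can make predictions. The data are as follows: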

Day     Contaminated

26/feb  1
29/feb  2
04/mar  3
05/mar  8
06/mar  13
07/mar  19
08/mar  25
10/mar  34
11/mar  52
12/mar  81
13/mar  98
14/mar  121
15/mar  176
16/mar  234
17/mar  291
18/mar  428
19/mar  621
20/mar  904
21/mar  1128
22/mar  1546
23/mar  1891
24/mar  2201
25/mar  2433

I think I need to run a power curve regression (nonlinear regression) in R, but I don't know how to implement it.

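Here is an approach using data.table, ggplot2, and nls.

First, let's convert the dates to a standard format and then to integers (days since the first observation) so that we can do calculations with them.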

library(data.table)
library(ggplot2)
setDT(data)
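# The dates carry no year, and as.Date() would fill in the current one, which
# turns 29/feb into NA in a non-leap year. 2020 is an assumed leap year here.
# Note that %b matches month abbreviations in the current locale.
data[,Day:= as.Date(paste0(Day,"/2020"),"%d/%b/%Y")]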
data[,Int := as.integer(Day)-min(as.integer(Day))]
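
If you would rather not depend on data.table or on the locale for the month names, the same conversion can be sketched in base R. This is only an illustration on a few of the dates: the year 2020 is an assumption (the post does not give one; a leap year is needed because of 29/feb), and the names `day`, `month_num`, `date` and `int` are made up for this example.

```r
# A few of the dates from the question, in their original "dd/mon" form.
day <- c("26/feb", "29/feb", "04/mar", "25/mar")

# Map the month abbreviation to a number by hand, so the result does not
# depend on the locale the way the %b format does.
month_num <- c(feb = 2L, mar = 3L)[substr(day, 4, 6)]

# ASSUMPTION: the year 2020 is not in the source data; it is used here only
# because 29 February exists only in leap years.
date <- as.Date(sprintf("2020-%02d-%s", month_num, substr(day, 1, 2)))

# Days elapsed since the first observation, as plain integers.
int <- as.integer(date - min(date))
print(int)
```

The first day gets the value 0, which is fine for `a * x ^ b` with positive `b`, but it would be a problem for any approach that takes `log(x)`.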

Then we use nls to fit the model to the data. We will use the formula y = a * x ^ b.

nls(formula = Contaminated ~ a * Int ^ b, data,start=list(a=1,b=1))
# Nonlinear regression model
#  model: Contaminated ~ a * Int^b
#   data: data
#        a         b 
#2.272e-05 5.571e+00 
# residual sum-of-squares: 123279
#
#Number of iterations to convergence: 48 
#Achieved convergence tolerance: 7.832e-07
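
Since the goal is prediction, here is a minimal sketch of how the fitted model could be used with `predict()`. It rebuilds the data in base R so that it runs on its own; the `Int` values are the day offsets from 26/feb assuming a leap year, and the names `d`, `fit`, `new_days` and `pred` are just placeholders for this example.

```r
# Day offsets since 26/feb (leap year assumed) and the observed counts.
d <- data.frame(
  Int = c(0, 3, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21,
          22, 23, 24, 25, 26, 27, 28),
  Contaminated = c(1, 2, 3, 8, 13, 19, 25, 34, 52, 81, 98, 121, 176, 234,
                   291, 428, 621, 904, 1128, 1546, 1891, 2201, 2433)
)

# Same model and starting values as above, stored so it can be reused.
fit <- nls(Contaminated ~ a * Int^b, data = d, start = list(a = 1, b = 1))

# Predict the next three days (offsets 29 to 31, i.e. 26 to 28 March).
new_days <- data.frame(Int = 29:31)
pred <- predict(fit, newdata = new_days)

print(coef(fit))
print(cbind(new_days, Predicted = pred))
```

The column in `newdata` has to be named `Int`, exactly like the variable in the formula. Keep in mind that this extrapolates the fitted curve beyond the observed range, so predictions more than a few days out should be treated with caution.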

Now we can look at the result with ggplot.

ggplot(data, aes(x=Int,y=Contaminated)) + 
  geom_point() +
  scale_x_continuous(breaks = c(0,10,20), labels = data$Day[data$Int %in% c(0,10,20)]) + xlab("Date") +
  geom_smooth(method="nls", formula = y ~ a * x ^ b,method.args = list(start = c(a=1, b=1)),se=FALSE, linetype = 1)

Data

data <- structure(list(Day = c("26/feb", "29/feb", "04/mar", "05/mar", 
"06/mar", "07/mar", "08/mar", "10/mar", "11/mar", "12/mar", "13/mar", 
"14/mar", "15/mar", "16/mar", "17/mar", "18/mar", "19/mar", "20/mar", 
"21/mar", "22/mar", "23/mar", "24/mar", "25/mar"), Contaminated = c(1L, 
2L, 3L, 8L, 13L, 19L, 25L, 34L, 52L, 81L, 98L, 121L, 176L, 234L, 
291L, 428L, 621L, 904L, 1128L, 1546L, 1891L, 2201L, 2433L)), class = "data.frame", row.names = c(NA, 
-23L))