在数据集中拟合多个逻辑增长曲线
Fitting multiple logistic growth curves in a dataset
我有多个县的人口数据,希望尽量减少每个县的重复拟合逻辑增长曲线。
county year pop
lake 1970 69305
lake 1980 104870
lake 1990 152104
lake 2000 210528
lake 2010 297052
marion 1970 69030
marion 1980 122488
marion 1990 194833
marion 2000 258916
marion 2010 331298
seminole 1970 83692
seminole 1980 179752
seminole 1990 287529
seminole 2000 365196
seminole 2010 422718
目前我正在对每个县进行子集化:
lake<-countypop[1:5,2:3]
colnames(lake)<-c("year", "pop")
marion<-countypop[6:10,2:3]
colnames(marion)<-c("year", "pop")
seminole<-countypop[11:15,2:3]
然后使用 SSlogis 绘制每个县的曲线,例如:
lake.model <- nls(pop ~ SSlogis(year, phi1, phi2, phi3, data = lake)))
alpha <- coef(lake.model)
plot(pop ~ year, data = lake, main = "Logistic Growth Model of Lake County",
xlab = "Year", ylab = "Population", xlim = c(1920, 2030),ylim=c(0,1000000))
curve(alpha[1]/(1 + exp(-(x - alpha[2])/alpha[3])), add = T, col = "blue")
我有大约 60 个县,我知道必须有一种更清洁的方法来做到这一点。我如何使用应用函数、循环或其他东西来消除代码中的重复?
试试这个:
pdf("countypop.pdf")
models <- by(countypop, countypop$county, function(x) {
fm <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = x)
plot(pop ~ year, x, main = county[1])
lines(fitted(fm) ~ year, x)
fm
})
dev.off()
注:我们用这个作为输入:
countypop <-
structure(list(county = c("lake", "lake", "lake", "lake", "lake",
"marion", "marion", "marion", "marion", "marion", "seminole",
"seminole", "seminole", "seminole", "seminole"), year = c(1970L,
1980L, 1990L, 2000L, 2010L, 1970L, 1980L, 1990L, 2000L, 2010L,
1970L, 1980L, 1990L, 2000L, 2010L), pop = c(69305L, 104870L,
152104L, 210528L, 297052L, 69030L, 122488L, 194833L, 258916L,
331298L, 83692L, 179752L, 287529L, 365196L, 422718L)), .Names = c("county",
"year", "pop"), class = "data.frame", row.names = c(NA, -15L))
我有多个县的人口数据,希望尽量减少每个县的重复拟合逻辑增长曲线。
county year pop
lake 1970 69305
lake 1980 104870
lake 1990 152104
lake 2000 210528
lake 2010 297052
marion 1970 69030
marion 1980 122488
marion 1990 194833
marion 2000 258916
marion 2010 331298
seminole 1970 83692
seminole 1980 179752
seminole 1990 287529
seminole 2000 365196
seminole 2010 422718
目前我正在对每个县进行子集化:
lake<-countypop[1:5,2:3]
colnames(lake)<-c("year", "pop")
marion<-countypop[6:10,2:3]
colnames(marion)<-c("year", "pop")
seminole<-countypop[11:15,2:3]
然后使用 SSlogis 绘制每个县的曲线,例如:
lake.model <- nls(pop ~ SSlogis(year, phi1, phi2, phi3, data = lake)))
alpha <- coef(lake.model)
plot(pop ~ year, data = lake, main = "Logistic Growth Model of Lake County",
xlab = "Year", ylab = "Population", xlim = c(1920, 2030),ylim=c(0,1000000))
curve(alpha[1]/(1 + exp(-(x - alpha[2])/alpha[3])), add = T, col = "blue")
我有大约 60 个县,我知道必须有一种更清洁的方法来做到这一点。我如何使用应用函数、循环或其他东西来消除代码中的重复?
试试这个:
pdf("countypop.pdf")
models <- by(countypop, countypop$county, function(x) {
fm <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = x)
plot(pop ~ year, x, main = county[1])
lines(fitted(fm) ~ year, x)
fm
})
dev.off()
注:我们用这个作为输入:
countypop <-
structure(list(county = c("lake", "lake", "lake", "lake", "lake",
"marion", "marion", "marion", "marion", "marion", "seminole",
"seminole", "seminole", "seminole", "seminole"), year = c(1970L,
1980L, 1990L, 2000L, 2010L, 1970L, 1980L, 1990L, 2000L, 2010L,
1970L, 1980L, 1990L, 2000L, 2010L), pop = c(69305L, 104870L,
152104L, 210528L, 297052L, 69030L, 122488L, 194833L, 258916L,
331298L, 83692L, 179752L, 287529L, 365196L, 422718L)), .Names = c("county",
"year", "pop"), class = "data.frame", row.names = c(NA, -15L))