数据系列,我如何在 R 中拟合分布?
Data series, how can i fit a distribution in R?
我对数据系列有一些问题,因为它有一些零值,所以有些分布不适合它。我试过 fitdist
和 fitdistr
函数,但没有一个起作用。有我的数据:
Precp
28
8
0
107
339
231
308
226
302
333
163
92
48
17
101
327
424
338
559
184
238
371
413
261
12
24
103
137
300
446
94
313
402
245
147
70
8
5
59
109,2
493,6
374,5
399,3
330,5
183,8
341,1
91
127,5
15
69
165,8
337,9
255
309,3
352,7
437,5
420,4
295,6
141,7
3,4
16,2
58,9
55,5
203,1
235
300
264,5
320,5
401,5
500,2
149
100
12
110
53,5
70
661,5
86
499,6
154,5
367
142
177
435
64
287,3
210,3
324,7
288,8
0
0
0
0
0
0
0
76,2
53
59,6
176,5
263,1
285,3
423,9
387
367,9
243,9
94
38
50
31
177
180
264
326
204
463,4
255,6
336,4
436,8
139
5
98
180
275,8
415,2
351,7
369
324
249
296
267
102
4
51
123
358,2
394
470
260
288
502
322
597
216
18,9
26
98
311,5
237,5
278
296
387,5
274,2
458,1
0
0
99,6
69,3
152,7
189
319,8
217,9
280,2
250,1
275,2
275
117,5
0
当我尝试拟合分布时,例如 Weibull,这是出现的消息:
> fw=fitdist(Precp,"weibull")
[1] "Error in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data, : \n non-finite value supplied by optim\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data, ddistnam = ddistname, hessian = TRUE, method = meth, lower = lower, upper = upper, ...): non-finite value supplied by optim>
Error in fitdist(Precp, "weibull") :
the function mle failed to estimate the parameters,
with the error code 100
当我尝试使用伽马分布时,同样的事情发生了。知道那里发生了什么吗?
如果想拟合威布尔分布等极值分布,可以试试evd
包:
library(evd)
> fit <- fgev(dat$Precp)
> fit
Call: fgev(x = dat$Precp)
Deviance: 2159.363
Estimates
loc scale shape
151.9567 137.6544 -0.1518
Standard Errors
loc scale shape
12.41071 9.24535 0.07124
Optimization Information
Convergence: successful
Function Evaluations: 27
Gradient Evaluations: 15
如果您对参数分布不感兴趣,您可以考虑 density
函数,它计算核密度。
由于您的数据似乎包含许多小值,您可以考虑混合两种分布。 flexmix
软件包可以为您做到这一点。
hist(dat$Precp,prob=T,col="gray", ylim=c(0,0.0042), breaks=seq(0,700, by=50)
xlab="", ylab="", main="")
legend("topright",
legend=c("density", "fgev", "flexmix"),
fill=c("darkgreen", "blue", "darkred")
)
xval <- seq(from=0, to=max(dat$Precp), length.out=200)
# density
fit1 <- density(dat$Precp)
lines(fit1, col="darkgreen", lwd=2)
# generalized extreme value distribution
fit2 <- fgev(dat$Precp)
param2 <- fit2$estimate
loc <- param2[["loc"]]
scal <- param2[["scale"]]
shape <- param2[["shape"]]
lines(xval, dgev(xval, loc=loc, scale=scal, shape=shape), col="blue", lwd=2)
# mixture of two Gamma distributions
# http://r.789695.n4.nabble.com/Gamma-mixture-models-with-flexmix-tt3328929.html#none
fit3 <- flexmix(Precp~1, data=subset(dat, Precp>0), k=2,
model = list(FLXMRglm(family = "Gamma"), FLXMRglm(family = "Gamma"))
)
param3 <- parameters(fit3)[[1]] # don't know why this is a list
interc <- param3[1,]
shape <- param3[2,]
lambda <- prior(fit3)
yval <- lambda[[1]]*dgamma(xval, shape=shape[[1]], rate=interc[[1]]*shape[[1]]) +
lambda[[2]]*dgamma(xval, shape=shape[[2]], rate=interc[[2]]*shape[[2]])
lines(xval, yval, col="darkred", lwd=2)
我对数据系列有一些问题,因为它有一些零值,所以有些分布不适合它。我试过 fitdist
和 fitdistr
函数,但没有一个起作用。有我的数据:
Precp
28
8
0
107
339
231
308
226
302
333
163
92
48
17
101
327
424
338
559
184
238
371
413
261
12
24
103
137
300
446
94
313
402
245
147
70
8
5
59
109,2
493,6
374,5
399,3
330,5
183,8
341,1
91
127,5
15
69
165,8
337,9
255
309,3
352,7
437,5
420,4
295,6
141,7
3,4
16,2
58,9
55,5
203,1
235
300
264,5
320,5
401,5
500,2
149
100
12
110
53,5
70
661,5
86
499,6
154,5
367
142
177
435
64
287,3
210,3
324,7
288,8
0
0
0
0
0
0
0
76,2
53
59,6
176,5
263,1
285,3
423,9
387
367,9
243,9
94
38
50
31
177
180
264
326
204
463,4
255,6
336,4
436,8
139
5
98
180
275,8
415,2
351,7
369
324
249
296
267
102
4
51
123
358,2
394
470
260
288
502
322
597
216
18,9
26
98
311,5
237,5
278
296
387,5
274,2
458,1
0
0
99,6
69,3
152,7
189
319,8
217,9
280,2
250,1
275,2
275
117,5
0
当我尝试拟合分布时,例如 Weibull,这是出现的消息:
> fw=fitdist(Precp,"weibull")
[1] "Error in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data, : \n non-finite value supplied by optim\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data, ddistnam = ddistname, hessian = TRUE, method = meth, lower = lower, upper = upper, ...): non-finite value supplied by optim>
Error in fitdist(Precp, "weibull") :
the function mle failed to estimate the parameters,
with the error code 100
当我尝试使用伽马分布时,同样的事情发生了。知道那里发生了什么吗?
如果想拟合威布尔分布等极值分布,可以试试evd
包:
library(evd)
> fit <- fgev(dat$Precp)
> fit
Call: fgev(x = dat$Precp)
Deviance: 2159.363
Estimates
loc scale shape
151.9567 137.6544 -0.1518
Standard Errors
loc scale shape
12.41071 9.24535 0.07124
Optimization Information
Convergence: successful
Function Evaluations: 27
Gradient Evaluations: 15
如果您对参数分布不感兴趣,您可以考虑 density
函数,它计算核密度。
由于您的数据似乎包含许多小值,您可以考虑混合两种分布。 flexmix
软件包可以为您做到这一点。
hist(dat$Precp,prob=T,col="gray", ylim=c(0,0.0042), breaks=seq(0,700, by=50)
xlab="", ylab="", main="")
legend("topright",
legend=c("density", "fgev", "flexmix"),
fill=c("darkgreen", "blue", "darkred")
)
xval <- seq(from=0, to=max(dat$Precp), length.out=200)
# density
fit1 <- density(dat$Precp)
lines(fit1, col="darkgreen", lwd=2)
# generalized extreme value distribution
fit2 <- fgev(dat$Precp)
param2 <- fit2$estimate
loc <- param2[["loc"]]
scal <- param2[["scale"]]
shape <- param2[["shape"]]
lines(xval, dgev(xval, loc=loc, scale=scal, shape=shape), col="blue", lwd=2)
# mixture of two Gamma distributions
# http://r.789695.n4.nabble.com/Gamma-mixture-models-with-flexmix-tt3328929.html#none
fit3 <- flexmix(Precp~1, data=subset(dat, Precp>0), k=2,
model = list(FLXMRglm(family = "Gamma"), FLXMRglm(family = "Gamma"))
)
param3 <- parameters(fit3)[[1]] # don't know why this is a list
interc <- param3[1,]
shape <- param3[2,]
lambda <- prior(fit3)
yval <- lambda[[1]]*dgamma(xval, shape=shape[[1]], rate=interc[[1]]*shape[[1]]) +
lambda[[2]]*dgamma(xval, shape=shape[[2]], rate=interc[[2]]*shape[[2]])
lines(xval, yval, col="darkred", lwd=2)