如何借助 lmomco 函数在 R 中为 fitdistr 函数定义自己的分布

how to define your own distribution for fitdistr function in R with the help of lmomco function

我想定义我自己的分布以与 fitdistrplus 函数一起使用,以拟合我从现在起的每月降水数据,称为 "month"。我正在使用“lmomco”功能来帮助我定义分布,但无法让它工作。例如,我正在定义广义极值 (gev) 分布,如下所示:

dgev<-pdfgev   #functions which are included in lmomco
 pgev<-cdfgev
qgev<-quagev

由于 "fitdistrplus" 需要参数 "start",它由所需分布的初始参数值组成,我将这些初始值估算如下:

lmom=lmoms(month,nmom=5)     #from lmomco package
para=pargev(lmom, checklmom=TRUE)

现在,我终于尝试使用“fitdist”函数将“month”拟合为 gev 分布:

fitgev <- fitdist(month, "gev", start=para[2]) #fitdistrplus

我收到如下错误。无论我在“lmomco”的帮助下定义哪个发行版,我都会得到同样的错误。有人可以提示我做错了什么吗?谢谢!

fitgev <- fitdist(month, "gev", start=para[2])
[1] "Error in dgev(c(27.6, 97.9, 100.6, 107.3, 108.5, 109, 112.4, 120.9, 137.8,  : \n  unused arguments (para.xi = 196.19347977195, para.alpha = 91.9579520442104, para.kappa = -0.00762962879097294)\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in dgev(c(27.6, 97.9, 100.6, 107.3, 108.5, 109, 112.4, 120.9, 137.8, 138.4, 144.7, 156.8, 163.1, 168.9, 169.1, 171.4, 176.1, 177.1, 178.8, 178.9, 187.2, 190.2, 190.5, 190.8, 191.2, 193.1, 195.2, 198.5, 199.8, 201.7, 206.9, 213.4, 220.7, 240, 253.5, 254.5, 256.1, 256.4, 257.5, 258.3, 261.5, 263.7, 264.7, 279.1, 284.2, 313.1, 314.7, 319.4, 321.6, 328.9, 330.1, 332.2, 358.3, 366.8, 367.9, 403.5, 424.1, 425.9, 457.3, 459.7, 468, 497.1, 508.5, 547.1), para.xi = 196.19347977195, para.alpha = 91.9579520442104,     para.kappa = -0.00762962879097294): unused arguments (para.xi = 196.19347977195, para.alpha = 91.9579520442104, para.kappa = -0.00762962879097294)>
Error in fitdist(month, "gev", start = para[2]) : 
  the function mle failed to estimate the parameters, 
                with the error code 100

fitdist 需要一个带有命名参数的 density/distribution 函数。

library("lmomco")
library("fitdistrplus")
## reproducible:
month <- c(27.6, 97.9, 100.6, 107.3, 108.5,
          109, 112.4, 120.9, 137.8)

设置:

lmom <- lmoms(month,nmom=5)     #from lmomco package
para <- pargev(lmom, checklmom=TRUE)
dgev <- pdfgev   #functions which are included in lmomco
pgev <- cdfgev
fitgev <- fitdist(month, "gev", start=para[[2]])
## Error in mledist(data, distname, start, fix.arg, ...) : 
##   'start' must specify names which are arguments to 'distr'

事实证明,我们需要重新定义 dgev,添加一些额外的管道,这将使 fitdistpdfgev 都满意:

 dgev <- function(x,xi,alpha,kappa) {
    pdfgev(x,list(type="gev",para=c(xi,alpha,kappa),source="pargev"))
}
fitgev <- fitdist(month, "gev", start=para[[2]])               
## Fitting of the distribution ' gev ' by maximum likelihood 
## Parameters:
##         estimate Std. Error
## xi    -25.587734         NA
## alpha  75.009842         NA
## kappa   1.593815         NA

确保 cumulative 函数中的参数有变量 q: pgev(q, par1, par2) 而不是 pgev(x, par1, par2)

因为错误消息本质上告诉你它找不到变量 q。

重点是使用: x 作为 pdf 输入 ;q 作为 cdf 输入 ;p 作为逆 cdf 输入

比如拟合一个自己定义的Gumble Distribution

# Data

x1 <- c(6.4,13.3,4.1,1.3,14.1,10.6,9.9,9.6,15.3,22.1,13.4,
13.2,8.4,6.3,8.9,5.2,10.9,14.4)

# Define pdf, cdf , inverse cdf

dgumbel <- function(x,a,b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q,a,b) exp(-exp((a-q)/b))
qgumbel <- function(p,a,b) a-b*log(-log(p))

# Fit with MLE
   f1c <- fitdist(x1,"gumbel",start=list(a=10,b=5))

# Goodness of Fit
   gofstat(f1c, fitnames = 'Gumbel MLE')

参考: https://www.rdocumentation.org/packages/fitdistrplus/versions/0.2-1/topics/fitdist