How do I extract the standard error from a Gaussian GLM model?
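I want to extract the standard error from my Gaussian GLM and my Poisson GLM. Does anyone know how I can do this?
Below is the code for the simulated data and the two models: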
#data simulated before fitting models
set.seed(20220520)
#simulating 200 values between 0 and 1 from a uniform distribution
x = runif(200, min = 0, max = 1)
lam = exp(0.3+5*x)
y = rpois(200, lambda = lam)
#before we do this each Yi may contain zeros so we need to add a small constant
y <- y + .1
#combining x and y into a dataframe so we can plot
df = data.frame(x, y)
#Gaussian GLM
model1 <- glm(y ~ x,
              data = df,
              family = gaussian(link = 'log'))
#Poisson GLM
model2 <- glm(y ~ x,
              data = df,
              family = poisson(link='log'))
This question is a bit deeper than it might first appear. In general, sigma() extracts the residual standard deviation:
Extract the estimated standard deviation of the errors, the
“residual standard deviation” (misnamed also “residual standard
error”, e.g., in ‘summary.lm()’'s output, from a fitted model).
Many classical statistical models have a scale parameter,
typically the standard deviation of a zero-mean normal (or
Gaussian) random variable which is denoted as sigma. ‘sigma(.)’
extracts the estimated parameter from a fitted model, i.e.,
sigma^.
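To make the terminology in the quoted documentation concrete, here is a minimal sketch on an ordinary linear model. It uses R's built-in cars data set, which is an assumption for illustration and not part of the question. It checks that sigma() matches both the value summary() reports as the "residual standard error" and the hand-computed residual standard deviation.

```r
# Illustration only: the built-in `cars` data set is not the question's data
fit <- lm(dist ~ speed, data = cars)

s_sigma   <- sigma(fit)            # residual standard deviation
s_summary <- summary(fit)$sigma    # printed as "Residual standard error" by summary()
s_manual  <- sqrt(sum(residuals(fit)^2) / df.residual(fit))

# Both comparisons should agree up to floating-point tolerance
all.equal(s_sigma, s_summary)
all.equal(s_sigma, s_manual)
```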
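This works as expected for the Gaussian model (sigma(model1)). However, it does not necessarily do what you might expect for the Poisson model: it returns the square root of the deviance divided by the residual degrees of freedom, which is analogous to the residual standard deviation but not the same thing.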
identical(
sigma(model1), ## 5.424689
sqrt(sum(residuals(model1)^2)/(df.residual(model1)))
) ## TRUE
sigma(model2) ## 1.017891
sqrt(sum(residuals(model2, type="response")^2)/(df.residual(model2))) ## 5.452
(If you redo this calculation using type = "deviance" [the default for residuals.glm], you will get the same answer as sigma() ...)
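The parenthetical remark above can be checked directly. The sketch below is self-contained, so it re-simulates the data from the question; as an assumption of this example, it fits the Poisson model to the raw counts without the + 0.1 offset, so the numbers will differ from those quoted above. The names s_sigma, s_dev_resid and s_deviance are introduced here only for illustration.

```r
# Re-create the simulated data from the question (without adding 0.1 to y)
set.seed(20220520)
x <- runif(200, min = 0, max = 1)
y <- rpois(200, lambda = exp(0.3 + 5 * x))
df <- data.frame(x, y)

model2 <- glm(y ~ x, data = df, family = poisson(link = "log"))

# What sigma() reports for the Poisson fit
s_sigma <- sigma(model2)

# The same quantity from deviance residuals (the default type for residuals.glm)
s_dev_resid <- sqrt(sum(residuals(model2, type = "deviance")^2) / df.residual(model2))

# Equivalently, straight from the model deviance
s_deviance <- sqrt(deviance(model2) / df.residual(model2))

all.equal(s_sigma, s_dev_resid)
all.equal(s_sigma, s_deviance)
```

For a Poisson fit this is a deviance-based scale, so it is best not read as a standard deviation of the residuals on the response scale.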
If you want to compare goodness of fit, you should consider a metric such as AIC ...
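PS: you probably shouldn't add 0.1 to your response. Not only is it unnecessary (for either the log-link Gaussian model or the Poisson model), it also produces a series of warnings about "non-integer x" when you fit the Poisson model (harmless in this case, but a further indication that you probably shouldn't be doing it). However, you do need to specify starting values for the log-link Gaussian model (start = c(1,1) seems to work).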
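Putting the two suggestions together, a rough sketch might look like the following: both models are fitted to the unmodified counts, the log-link Gaussian model is given the starting values mentioned above, and the fits are compared with AIC. The name aic_table is introduced here for illustration, and whether start = c(1, 1) is a good choice for other data sets is not something this example establishes.

```r
# Re-create the simulated data from the question, leaving the counts as integers
set.seed(20220520)
x <- runif(200, min = 0, max = 1)
y <- rpois(200, lambda = exp(0.3 + 5 * x))
df <- data.frame(x, y)

# Log-link Gaussian GLM: y may contain zeros, so starting values are supplied
model1 <- glm(y ~ x, data = df,
              family = gaussian(link = "log"),
              start = c(1, 1))

# Poisson GLM: integer response, so no "non-integer x" warnings
model2 <- glm(y ~ x, data = df, family = poisson(link = "log"))

# One row per model, with the number of parameters (df) and the AIC
aic_table <- AIC(model1, model2)
aic_table
```

Lower AIC indicates the better-supported model among those compared; both models here are fitted to the same response, which is what makes the comparison meaningful.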