Poisson GLM for non-integer counts - R

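I'd like some advice on GLMs with the Poisson family.

I have a dataset containing the number of bites taken by each individual over a period of time. Because the individuals were observed feeding for different lengths of time, I get non-integers when I calculate each individual's bite rate as bites/minute. From what I have read so far, I should still be able to fit a GLM with the Poisson family. However, I am getting an error, and I think it may be because R does not like me using non-integers. Does anyone have suggestions?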

Example <- structure(list(Species = c("Fish1", "Fish2", "Fish3", "Fish4", 
"Fish5", "Fish6", "Fish7", "Fish1", "Fish2", "Fish3", "Fish4", 
"Fish5", "Fish6", "Fish7", "Fish1", "Fish2", "Fish3", "Fish4", 
"Fish5", "Fish6", "Fish7"), Site = c(1, 1, 1, 1, 1, 1, 1, 2, 
2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3), Bite_Rate = c(3.5, 7.5, 
0, 0, 2.45, 5.5, 6.5, 6.5, 7.5, 8.03, 32.1, 15.6, 18.2, 19.1, 
20.5, 20.5, 3.5, 5.7, 6.7, 23.2, 0)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -21L), spec = structure(list(
   cols = list(Species = structure(list(), class = c("collector_character", 
   "collector")), Site = structure(list(), class = c("collector_double", 
   "collector")), Bite_Rate = structure(list(), class = c("collector_double", 
   "collector"))), default = structure(list(), class = c("collector_guess", 
   "collector")), skip = 1), class = "col_spec"))

str(Example) # check structure 
Example$Species<-as.factor(Example$Species) # set species as a factor 
str(Example) # check structure 
glm<-glm(Species~Bite_Rate, data=Example, family = poisson) # create the GLM

The error message I get when I run the GLM is:

Error in if (any(y < 0)) stop("negative values not allowed for the 'Poisson' family") : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In Ops.factor(y, 0) : ‘<’ not meaningful for factors 

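I don't actually have any negative values, so this has thrown me off a bit. Any advice would be much appreciated!

Edit: Following the comments, I have updated my example data so that it has bite counts and the observation time in seconds.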


Example <- structure(list(Species = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("Fish1", 
"Fish2", "Fish3", "Fish4", "Fish5", "Fish6", "Fish7"), class = "factor"), 
    Site = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 
    3, 3, 3, 3, 3), Bites = c(0, 10, 18, 17, 6, 0, 1, 0, 19, 
    12, 7, 3, 5, 1, 5, 0, 10, 18, 17, 7, 25), Observed_Seconds = c(50, 
    33, 47, 20, 17, 10, 14, 21, 48, 10, 50, 33, 47, 20, 17, 10, 
    14, 21, 48, 10, 90)), row.names = c(NA, -21L), spec = structure(list(
    cols = list(Species = structure(list(), class = c("collector_character", 
    "collector")), Site = structure(list(), class = c("collector_double", 
    "collector")), Bites = structure(list(), class = c("collector_double", 
    "collector")), Observed_Seconds = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"))

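Thanks!

You need the observation time in a common unit, here seconds (the denominator), and the actual number of bites (the count).

Next, include the log of the observation time as an offset. Note that your response variable goes on the left-hand side of ~: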

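# 0 + Species drops the intercept, so each species gets its own log-rate
# coefficient (this matches the SpeciesFish1..SpeciesFish7 rows in the summary)
fit = glm(Bites ~ 0 + Species, offset = log(Observed_Seconds),
          family = poisson, data = Example)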
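On the point about the response belonging on the left of `~`: the call in the question, `glm(Species ~ Bite_Rate, ...)`, put the factor `Species` there, and that appears to be what triggered the error rather than the non-integer rates. The Poisson family checks `any(y < 0)` on the response, and for a factor that comparison yields `NA`. A minimal sketch, using a toy factor as a stand-in for the `Species` column:

```r
# A toy factor standing in for the Species column (values are illustrative)
sp <- factor(c("Fish1", "Fish2", "Fish3"))

# Comparing an unordered factor with a number is not meaningful:
# R warns ("'<' not meaningful for factors") and returns NA for every element
cmp <- suppressWarnings(sp < 0)
print(cmp)

# any() over all-NA input is NA, and if (NA) is what raises
# "missing value where TRUE/FALSE needed"
chk <- suppressWarnings(any(sp < 0))
print(is.na(chk))
```

So the message about negative values is a side effect of the failed check, not a statement about the data.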

You can look at the summary of the fit:

summary(fit)

   Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
SpeciesFish1  -2.8679     0.4472  -6.413 1.42e-10 ***
SpeciesFish2  -1.1436     0.1857  -6.158 7.35e-10 ***
SpeciesFish3  -0.5738     0.1581  -3.629 0.000284 ***
SpeciesFish4  -0.7732     0.1543  -5.011 5.42e-07 ***
SpeciesFish5  -1.3269     0.1961  -6.766 1.33e-11 ***
SpeciesFish6  -1.7198     0.2887  -5.958 2.56e-09 ***
SpeciesFish7  -1.5244     0.1925  -7.921 2.35e-15 ***
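To read these coefficients: with a log link and a `log(Observed_Seconds)` offset, each one is a log bite rate per second. When the model has no intercept and only `Species` as a predictor, the fitted rate for a species should equal its total bites divided by its total observed seconds. The sketch below checks this on the updated example data; the data frame is rebuilt compactly from the values in the question, and the names `pooled_rate`, `rate_per_second` and `rate_per_minute` are mine.

```r
# Rebuild the relevant columns of the updated example data
Example <- data.frame(
  Species = factor(rep(paste0("Fish", 1:7), times = 3)),
  Bites = c(0, 10, 18, 17, 6, 0, 1, 0, 19, 12, 7, 3, 5, 1,
            5, 0, 10, 18, 17, 7, 25),
  Observed_Seconds = c(50, 33, 47, 20, 17, 10, 14, 21, 48, 10, 50, 33, 47, 20,
                       17, 10, 14, 21, 48, 10, 90)
)

# Poisson rate model: counts as the response, log exposure as the offset
fit <- glm(Bites ~ 0 + Species, offset = log(Observed_Seconds),
           family = poisson, data = Example)

# Pooled rate per species: total bites / total seconds
totals <- aggregate(cbind(Bites, Observed_Seconds) ~ Species,
                    data = Example, FUN = sum)
pooled_rate <- totals$Bites / totals$Observed_Seconds

# Back-transform the coefficients from log rates to rates per second
rate_per_second <- exp(coef(fit))
print(all.equal(unname(rate_per_second), pooled_rate, tolerance = 1e-5))

# The question asked about bites/minute: rescale after fitting
rate_per_minute <- 60 * rate_per_second
print(round(rate_per_minute, 2))
```

For example, Fish1 has 5 bites in 88 seconds, and log(5/88) is about -2.868, which agrees with the first coefficient in the table.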

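The coefficients look significant, but it is better to check whether the data are overdispersed, and also to include other factors (for example, site):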

fit_quasi = glm(Bites ~ Species + factor(Site),offset=log(Observed_Seconds),
          family=quasipoisson,data=Example)
summary(fit_quasi)

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept)    -2.9754     1.0713  -2.777   0.0167 *
SpeciesFish2    1.8487     1.1434   1.617   0.1319  
SpeciesFish3    2.2731     1.1152   2.038   0.0642 .
SpeciesFish4    2.1246     1.1205   1.896   0.0823 .
SpeciesFish5    1.3533     1.1604   1.166   0.2662  
SpeciesFish6    1.2754     1.2658   1.008   0.3336  
SpeciesFish7    0.8922     1.1719   0.761   0.4612  
factor(Site)2  -0.2325     0.5132  -0.453   0.6587  
factor(Site)3   0.6118     0.4677   1.308   0.2154  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for quasipoisson family taken to be 5.521045)

If the counts followed a Poisson distribution, the dispersion would be around 1; here it is about 5.5, so the data are overdispersed.
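As a rough check, the dispersion reported for the quasipoisson fit can be reproduced by hand: it is the sum of squared Pearson residuals from the corresponding Poisson fit divided by the residual degrees of freedom. The sketch below rebuilds the data from the question (including `Site`) and compares the two numbers; `fit_pois` and `pearson_disp` are names I have introduced.

```r
# Rebuild the updated example data, including Site
Example <- data.frame(
  Species = factor(rep(paste0("Fish", 1:7), times = 3)),
  Site = rep(c(1, 2, 3), times = c(7, 6, 8)),
  Bites = c(0, 10, 18, 17, 6, 0, 1, 0, 19, 12, 7, 3, 5, 1,
            5, 0, 10, 18, 17, 7, 25),
  Observed_Seconds = c(50, 33, 47, 20, 17, 10, 14, 21, 48, 10, 50, 33, 47, 20,
                       17, 10, 14, 21, 48, 10, 90)
)

# Same linear predictor fitted with both families
fit_pois <- glm(Bites ~ Species + factor(Site), offset = log(Observed_Seconds),
                family = poisson, data = Example)
fit_quasi <- glm(Bites ~ Species + factor(Site), offset = log(Observed_Seconds),
                 family = quasipoisson, data = Example)

# Pearson chi-square statistic divided by the residual degrees of freedom
pearson_disp <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)

# The two values should agree closely
print(c(pearson = pearson_disp, quasi = summary(fit_quasi)$dispersion))
```

The two families give the same coefficient estimates; the quasipoisson fit only scales the standard errors by the square root of this dispersion, which is why the p-values in the second summary are so much larger.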