如何将函数应用于现有数据框的结果添加？

Question

我正在尝试计算某些比率的置信区间。我正在使用 tidyverse 和 epitools 从 Byar 的方法计算 CI。

我几乎可以肯定做错了什么。

library (tidyverse)
library (epitools)


# here's my made up data

DISEASE = c("Marco Polio","Marco Polio","Marco Polio","Marco Polio","Marco Polio",
            "Mumps","Mumps","Mumps","Mumps","Mumps",
            "Chicky Pox","Chicky Pox","Chicky Pox","Chicky Pox","Chicky Pox")
YEAR = c(2011, 2012, 2013, 2014, 2015,
         2011, 2012, 2013, 2014, 2015,
         2011, 2012, 2013, 2014, 2015)
VALUE = c(82,89,79,51,51,
          79,91,69,89,78,
          71,69,95,61,87)
AREA =c("A", "B","C")

DATA = data.frame(DISEASE, YEAR, VALUE,AREA)


# this is a simplification, I have the population values in another table, which I've merged 
# to give me the dataframe I then apply pois.byar to.
DATA$POPN = ifelse(DATA$AREA == "A",2.5,
              ifelse(DATA$AREA == "B",3,
                     ifelse(DATA$AREA == "C",7,0)))


# this bit calculates the number of things per area
rates<-DATA%>%group_by(DISEASE,AREA,POPN)%>%
  count(AREA)

然后如果我想计算 CI 我认为这可行

rates<-DATA%>%group_by(DISEASE,AREA,POPN)%>%
  count(AREA) %>%
  mutate(pois.byar(rates$n,rates$POPN))

但我明白了

Error in mutate_impl(.data, dots) : 
  Evaluation error: arguments imply differing number of rows: 0, 1.

然而这有效：

pois.byar(rates$n,rates$POPN)

然后说："turn the results of the pois.byar function into a dataframe and then merge back to the original" 似乎很愚蠢。我可能只是为了获取一些数据而尝试过……我不想那样做。这不是正确的做事方式。

非常感谢收到任何建议。我认为这是一个相当基本的问题。这表明我不是坐着学习，而是边做边做。

这就是我想要的疾病年份 n 地区 popn x pt 率 lower upper conf.level

Answer 1

我不清楚你的预期输出对我来说应该是什么。您的评论并没有真正的帮助。最好 明确地 包含您提供的示例数据的预期输出。

这里的问题是pois.byvar returns a data.frame。因此，为了 mutate 能够使用 pois.byvar 的输出，我们需要将 data.frame 存储在 list.

中

这是您的代码的更简洁版本

library(tidyverse)
DATA %>%
    mutate(POPN = case_when(
        AREA == "A" ~ 2.5,
        AREA == "B" ~ 3,
        AREA == "C" ~ 7,
        TRUE ~ 0)) %>%
    group_by(DISEASE,AREA,POPN) %>%
    count(AREA) %>%
    mutate(res = list(pois.byar(n, POPN)))

这将创建一个列 res，其中包含 pois.byar 的 data.frame 输出。

或者您可能希望 unnest list 列将条目扩展到不同的列中？

library(tidyverse)
DATA %>%
    mutate(POPN = case_when(
        AREA == "A" ~ 2.5,
        AREA == "B" ~ 3,
        AREA == "C" ~ 7,
        TRUE ~ 0)) %>%
    group_by(DISEASE,AREA,POPN) %>%
    count(AREA) %>%
    mutate(res = list(pois.byar(n, POPN))) %>%
    unnest()
## A tibble: 9 x 10
## Groups:   DISEASE, AREA, POPN [9]
#  DISEASE     AREA   POPN     n     x    pt  rate  lower upper conf.level
#  <fct>       <fct> <dbl> <int> <int> <dbl> <dbl>  <dbl> <dbl>      <dbl>
#1 Chicky Pox  A       2.5     1     1   2.5 0.4   0.0363 1.86        0.95
#2 Chicky Pox  B       3       2     2   3   0.667 0.133  2.14        0.95
#3 Chicky Pox  C       7       2     2   7   0.286 0.0570 0.916       0.95
#4 Marco Polio A       2.5     2     2   2.5 0.8   0.160  2.56        0.95
#5 Marco Polio B       3       2     2   3   0.667 0.133  2.14        0.95
#6 Marco Polio C       7       1     1   7   0.143 0.0130 0.666       0.95
#7 Mumps       A       2.5     2     2   2.5 0.8   0.160  2.56        0.95
#8 Mumps       B       3       1     1   3   0.333 0.0302 1.55        0.95
#9 Mumps       C       7       2     2   7   0.286 0.0570 0.916       0.95

如何将函数应用于现有数据框的结果添加？

How to add the results of applying a function to an existing data frame?

r

confidence-interval

dplyr

tidyverse