Frequency table with mean and SD in R with multiple cases per row
I want to create a frequency table that gives the mean and SD of consultations per year, based on the following (dummy) data:
     id icpc icpc2       date
 1: 123  D95   F15 2015-06-19
 2: 123  F85       2016-08-15
 3: 332  A01       2010-03-16
 4: 332  A04       2018-01-20
 5: 332  K20       2017-02-20
 6: 100  B10       2017-06-01
 7: 100  A04       2008-01-11
 8: 113  T08       2018-03-18
 9: 113  P28       2017-01-19
10: 113  D95   A01 2013-01-16
11: 113  A04       2009-05-01
12: 551  B12   A01 2011-04-03
13: 551  D95       2015-05-09
Reproducible data:
df <- structure(list(id = c(123L, 123L, 332L, 332L, 332L, 100L, 100L,
113L, 113L, 113L, 113L, 551L, 551L), icpc = c("D95", "F85", "A01",
"A04", "K20", "B10", "A04", "T08", "P28", "D95", "A04", "B12",
"D95"), icpc2 = c("F15", "", "", "", "", "", "", "", "", "A01",
"", "A01", ""), date = c("2015-06-19", "2016-08-15", "2010-03-16",
"2018-01-20", "2017-02-20", "2017-05-01", "2008-01-11", "2018-03-18",
"2017-01-19", "2013-01-16", "2009-05-01", "2011-04-03", "2015-05-09"
)), class = "data.frame", row.names = c(NA, -13L))
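I took the following steps and managed to get a frequency table with the mean, but I think there should be an easier way, and I still cannot get the SD. Please help me get the SD per year.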
To calculate the mean, I created a new column (consult) that holds a 1 for each consultation (based on icpc):
library(data.table)
# every row is one consultation, so consult is always 1 (numeric)
setDT(df)[, consult := 1]
From there:
library(dplyr)
# consultation frequency per patient per year
# (date is stored as character, so convert it before extracting the year)
df.freq.year <- df %>%
  mutate(year = format(as.Date(date), "%Y")) %>%
  group_by(id, year) %>%
  summarise(frequency = sum(consult), .groups = "drop")
# mean consultations per year, averaged over patients
# (group by year only; grouping by id and year would just return each frequency)
df.mean.year <- df.freq.year %>%
  group_by(year) %>%
  summarise(mean = mean(frequency))
# number of patients per year
df.pat <- df %>%
  mutate(year = format(as.Date(date), "%Y")) %>%
  group_by(year) %>%
  summarise(Nbr.patients = n_distinct(id))
I tried the following (without success):
sqrt(var(df.freq.year$frequency, by = "year"))
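A note on why this attempt cannot work: as far as I can tell, stats::var() has no by argument, so the call stops with an "unused argument" error rather than grouping by year. Below is a minimal base R sketch of a grouped SD using tapply(). The data frame freq and its values are made up for illustration and only stand in for df.freq.year.

```r
# Hypothetical per-patient, per-year counts (stand-in for df.freq.year)
freq <- data.frame(
  id = c(1L, 2L, 3L, 1L, 2L),
  year = c("2017", "2017", "2017", "2018", "2018"),
  frequency = c(2, 4, 6, 1, 3)
)

# tapply() splits frequency by year and applies the function to each group
sd.by.year <- tapply(freq$frequency, freq$year, sd)
mean.by.year <- tapply(freq$frequency, freq$year, mean)

# For 2017 the counts are 2, 4, 6, so the mean is 4 and the SD is 2
print(cbind(mean = mean.by.year, SD = sd.by.year))
```

The same idea carries over to dplyr: group first, then call sd() inside summarise().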
My output should look like this:
    year mean  SD
 1: 2008  5.2 1.3
 2: 2009  4.0 1.1
 3: 2010  8.9 1.6
 4: 2011  4.9 2.1
 5: 2012  3.4 1.1
 6: 2013  2.3 1.1
 7: 2014  9.5 1.3
 8: 2015 12.0 2.1
 9: 2016 11.4 2.6
10: 2017  8.9 2.0
11: 2018  6.7 2.2
OK, I managed to solve it...
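# consultation frequency per patient per year
df.freq.patyear <- df %>%
  mutate(year = format(as.Date(date), "%Y")) %>%  # make sure df has a year column
  group_by(id, year) %>%
  summarise(frequency = sum(consult), .groups = "drop")
# calculate SD per year
df.sd <- df.freq.patyear %>%
  group_by(year) %>%
  summarise(SD = sd(frequency))
# combine the per-year mean and SD into one table
df.table <- merge(df.mean.year, df.sd, by = "year")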
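For comparison, the steps above can probably be collapsed into a single pipeline. This is only a sketch under a few assumptions: it uses dplyr's count() in place of the consult helper column, it rebuilds a trimmed copy of the dummy data (id and date only, with row 8 taken as 2018-03-18), and it averages only over patients who have at least one consultation in a given year.

```r
library(dplyr)

# Trimmed copy of the dummy data: only the columns needed here
df <- data.frame(
  id = c(123L, 123L, 332L, 332L, 332L, 100L, 100L,
         113L, 113L, 113L, 113L, 551L, 551L),
  date = c("2015-06-19", "2016-08-15", "2010-03-16", "2018-01-20",
           "2017-02-20", "2017-05-01", "2008-01-11", "2018-03-18",
           "2017-01-19", "2013-01-16", "2009-05-01", "2011-04-03",
           "2015-05-09")
)

df.table <- df %>%
  mutate(year = format(as.Date(date), "%Y")) %>%
  count(id, year, name = "frequency") %>%  # one row per patient per year
  group_by(year) %>%
  summarise(
    Nbr.patients = n(),      # patients with at least one consultation that year
    mean = mean(frequency),
    SD = sd(frequency)       # NA when a year has only one patient
  )

print(df.table)
```

With this dummy data every patient has exactly one consultation in each year they appear, so the means are all 1 and the SDs are either 0 or NA. The real data should give more varied values.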