Median number of observations per person
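I have a long-format data frame containing repeated height measurements for a group of people.
The mean number of observations is 2000/500 = 4 observations per child.
How can I calculate the median and interquartile range of the number of observations per child?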
# Simulated wide data: one row per child, four height measurements each
data <- data.frame(
  child_id = 1:500,
  height_1 = rnorm(500, mean = 80, sd = 2),
  height_2 = rnorm(500, mean = 90, sd = 2),
  height_3 = rnorm(500, mean = 100, sd = 2),
  height_4 = rnorm(500, mean = 115, sd = 2)
)
# Reshape to long format: one row per child per measurement occasion
data_long <- reshape(
  data,
  varying = c("height_1", "height_2", "height_3", "height_4"),
  direction = "long", idvar = "child_id", timevar = "time", sep = "_"
)
# Mean observations per child = 2000/500 = 4
data_long$id_f <- as.factor(data_long$child_id)
length(unique(data_long$id_f)) # 500 children
length(data_long$height) # 2000 observations
We can use dplyr. Group by 'child_id' and get the median and IQR of the 'height' column:
library(dplyr)
data_long %>%
  group_by(child_id) %>%
  summarise(median = median(height),
            interQuartileRange = IQR(height))
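For readers without dplyr, the same per-child summary of heights can be sketched in base R with tapply(). This is only an illustration under stated assumptions: `toy_long` is a hypothetical stand-in for `data_long` with three children, and the seed and simulated values are arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical stand-in for data_long: 3 children, 4 heights each
toy_long <- data.frame(
  child_id = rep(1:3, each = 4),
  height   = rnorm(12, mean = 95, sd = 10)
)

# Median and IQR of height within each child
height_median <- tapply(toy_long$height, toy_long$child_id, median)
height_iqr    <- tapply(toy_long$height, toy_long$child_id, IQR)

# Collect into one row per child, mirroring the column names used above
per_child <- data.frame(
  child_id           = as.integer(names(height_median)),
  median             = as.vector(height_median),
  interQuartileRange = as.vector(height_iqr)
)
per_child
```

Note that this summarises the height values themselves, not the number of observations per child.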
If we instead want the median and IQR of the number of observations per child:
data_long %>%
  count(child_id) %>%
  summarise(median = median(n), IQR = IQR(n))
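In the simulated data every child has exactly four rows, so the counts are all 4 and this should return a median of 4 and an IQR of 0. To see the calculation do something less trivial, here is a minimal sketch on a hypothetical unbalanced data set. The `toy` data frame and its values are made up for illustration, and base R's table() stands in for count() so the example runs without dplyr.

```r
# Hypothetical unbalanced data: children 1..5 have 1, 2, 3, 4, 5 rows
toy <- data.frame(
  child_id = rep(1:5, times = 1:5),
  height   = seq(80, by = 1, length.out = 15)
)

# Number of rows (observations) per child
n_per_child <- as.vector(table(toy$child_id))

# Median and IQR of the per-child counts (IQR uses R's default quantile type 7)
obs_median <- median(n_per_child)
obs_iqr    <- IQR(n_per_child)

# The quartiles themselves, if the IQR is to be reported as a range
obs_quartiles <- quantile(n_per_child, probs = c(0.25, 0.75))

obs_median     # 3
obs_iqr        # 2
obs_quartiles  # 25% = 2, 75% = 4
```

Reporting the quartiles alongside the IQR is often more informative for count data, since an IQR of 2 alone does not say where the middle half of the children lies.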