Counting observations and considering a condition
I have a dataset like this:
id <- c(rep(1,3), rep(2, 3), rep(3, 3))
condition <- c(0, 0, 1, 0, 0, 1, 1, 1, 0)
time_point1 <- c(1, 1, NA)
time_point2 <- c(NA, 1, NA)
time_point3 <- c(NA, NA, NA)
time_point4 <- c(1, NA, NA, 1, NA, NA, NA, NA, 1)
data <- data.frame(id, condition, time_point1, time_point2, time_point3, time_point4)
data
  id condition time_point1 time_point2 time_point3 time_point4
1  1         0           1          NA          NA           1
2  1         0           1           1          NA          NA
3  1         1          NA          NA          NA          NA
4  2         0           1          NA          NA           1
5  2         0           1           1          NA          NA
6  2         1          NA          NA          NA          NA
7  3         1           1          NA          NA          NA
8  3         1           1           1          NA          NA
9  3         0          NA          NA          NA           1
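I would like to make a table showing how many people have condition == 1 (n_x) and how many people there are at each time point (n_t). If there are none, I also want a 0. I tried this: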
data %>%
  pivot_longer(cols = contains("time_point")) %>%
  filter(!is.na(value)) %>%
  group_by(name) %>%
  mutate(n_t = n_distinct(id)) %>%
  ungroup() %>%
  filter(condition == 1) %>%
  group_by(name) %>%
  summarise(n_x = n_distinct(id), n_t = first(n_t))
and got this:
  name          n_x   n_t
  <chr>       <int> <int>
1 time_point1     1     3
2 time_point2     1     3
Desired result: I want a table like this, which covers both the cases with the condition and the cases without it:
         name n_x n_t
1 time_point1   2   6
2 time_point2   1   3
3 time_point3   0   0
4 time_point4   0   3
Thanks!
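You can use pivot_longer() so that you are able to group_by() the time points, and then summarise by simply adding up the values. For the condition, sum the condition column only over the rows where values is not NA.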
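library(dplyr)
library(tidyr)

data %>%
  pivot_longer(cols = 3:6, names_to = "point", values_to = "values") %>%
  group_by(point) %>%
  # n_x: sum condition over rows with a non-missing value; n_t: sum the values
  summarise(n_x = sum(condition[!is.na(values)]), n_t = sum(values, na.rm = TRUE))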
Output:
# A tibble: 4 x 3
  point         n_x   n_t
  <chr>       <dbl> <dbl>
1 time_point1     2     6
2 time_point2     1     3
3 time_point3     0     0
4 time_point4     0     3
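Summing the values gives the right counts here only because every observed value is 1. If that may not hold in your real data, a variant that counts the non-missing entries instead is probably safer. Below is a minimal base R sketch of that idea, assuming the example data from the question; the names tp, observed and result are my own and purely illustrative.

```r
# Rebuild the example data from the question
# (data.frame() recycles the length-3 vectors to 9 rows).
id <- c(rep(1, 3), rep(2, 3), rep(3, 3))
condition <- c(0, 0, 1, 0, 0, 1, 1, 1, 0)
data <- data.frame(
  id, condition,
  time_point1 = c(1, 1, NA),
  time_point2 = c(NA, 1, NA),
  time_point3 = c(NA, NA, NA),
  time_point4 = c(1, NA, NA, 1, NA, NA, NA, NA, 1)
)

# Hypothetical helper names: tp holds the time point column names,
# observed is a logical matrix marking the non-missing entries.
tp <- grep("^time_point", names(data), value = TRUE)
observed <- !is.na(data[tp])

result <- data.frame(
  name = tp,
  # rows with a non-missing value AND condition == 1, per time point
  n_x = unname(colSums(observed & data$condition == 1)),
  # rows with a non-missing value, per time point
  n_t = unname(colSums(observed))
)
result
```

Because this counts TRUE/FALSE flags rather than adding the stored values, a time point with no observations at all (time_point3 here) still shows up with 0 in both columns.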