How to randomise order of group within group in R/dplyr?
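I have a group nested within another group in my data. I want to randomise the order of the nested groups while preserving the order of the rows within each nested group. (This will be a step in an existing pipeline, so a tidyverse solution would be ideal.)
In the example below, how can I randomise the order of block within participant_id, while preserving the order of participant_id and of trial within each block?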
library(dplyr)
set.seed(123)
# dummy data
data <- tibble::tribble(
  ~participant_id, ~block, ~trial,
  1L, "a", 1L,
  1L, "a", 2L,
  1L, "a", 3L,
  1L, "b", 1L,
  1L, "b", 2L,
  1L, "b", 3L,
  2L, "a", 1L,
  2L, "a", 2L,
  2L, "a", 3L,
  2L, "b", 1L,
  2L, "b", 2L,
  2L, "b", 3L
)
# something along the lines of...
new_data <- data %>%
  group_by(participant_id) # %>%
  # ? a step here to randomise the order of 'block' within each participant,
  #   while preserving the order of 'trial' within each block.
Thanks.
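One option could be:
data %>%
  group_by(participant_id) %>%
  # rleid: run id that increases each time 'block' changes within a participant
  # block_random: a random permutation of the row numbers within a participant
  mutate(rleid = cumsum(block != lag(block, default = first(block))),
         block_random = sample(n())) %>%
  group_by(participant_id, rleid) %>%
  # give every row of a block the same random key (the smallest one drawn for it)
  mutate(block_random = min(block_random)) %>%
  ungroup()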
participant_id block trial rleid block_random
<int> <chr> <int> <int> <int>
1 1 a 1 0 2
2 1 a 2 0 2
3 1 a 3 0 2
4 1 b 1 1 1
5 1 b 2 1 1
6 1 b 3 1 1
7 2 a 1 0 2
8 2 a 2 0 2
9 2 a 3 0 2
10 2 b 1 1 1
11 2 b 2 1 1
12 2 b 3 1 1
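Note that the pipeline above only attaches a random key (block_random) to each block; as the output shows, the rows themselves are still in their original order. A minimal sketch of a possible final step follows, assuming the intent is to sort by that key within each participant. The helper column row_in_block and the result name shuffled are introduced here for illustration only; row_in_block keeps the original row order inside each block explicit instead of relying on how ties are sorted. The data is the same dummy data as in the question, built more compactly.

```r
library(dplyr)
set.seed(123)

# Same dummy data as in the question
data <- tibble::tibble(
  participant_id = rep(1:2, each = 6),
  block = rep(rep(c("a", "b"), each = 3), times = 2),
  trial = rep(1:3, times = 4)
)

shuffled <- data %>%
  group_by(participant_id) %>%
  mutate(rleid = cumsum(block != lag(block, default = first(block))),
         block_random = sample(n())) %>%
  group_by(participant_id, rleid) %>%
  mutate(block_random = min(block_random),
         row_in_block = row_number()) %>%   # hypothetical helper column
  ungroup() %>%
  # sort blocks by their random key, keeping rows in order inside each block
  arrange(participant_id, block_random, row_in_block) %>%
  select(-rleid, -block_random, -row_in_block)

shuffled
```

Because the keys are a permutation of the row numbers within a participant, the minimum per block is different for every block, so no two blocks can tie.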
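Another one:
# Randomise block order within one participant.
# group_map() passes .x (the rows of one group, without the grouping column)
# and .y (a one-row tibble holding the group key).
randomiseGroup <- function(.x, .y) {
  # One random number per distinct block, so that any number of blocks can be handled
  r <- .x %>%
    distinct(block) %>%
    mutate(random = runif(n()))
  # Sort blocks by their random number, with trials in order within each block,
  # then re-attach the participant key
  .y %>%
    bind_cols(
      .x %>%
        ungroup() %>%
        left_join(r, by = "block") %>%
        arrange(random, trial) %>%
        select(-random)
    )
}
# Randomise all participants
data %>%
  group_by(participant_id) %>%
  group_map(randomiseGroup) %>%
  bind_rows()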
# A tibble: 12 × 3
participant_id block trial
<int> <chr> <int>
1 1 a 1
2 1 a 2
3 1 a 3
4 1 b 1
5 1 b 2
6 1 b 3
7 2 b 1
8 2 b 2
9 2 b 3
10 2 a 1
11 2 a 2
12 2 a 3
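For comparison, here is a more compact sketch of the same idea as the two answers above. It is not taken from either answer. It assumes block is a character column and that each block label identifies exactly one block per participant. The column names block_pos and row_id are hypothetical helpers introduced for illustration.

```r
library(dplyr)
set.seed(123)

# Same dummy data as in the question
data <- tibble::tibble(
  participant_id = rep(1:2, each = 6),
  block = rep(rep(c("a", "b"), each = 3), times = 2),
  trial = rep(1:3, times = 4)
)

shuffled <- data %>%
  mutate(row_id = row_number()) %>%          # remember the original row order
  group_by(participant_id) %>%
  # sample(unique(block)) draws a random block order for this participant;
  # match() maps each row's block to its position in that order
  mutate(block_pos = match(block, sample(unique(block)))) %>%
  ungroup() %>%
  arrange(participant_id, block_pos, row_id) %>%
  select(-block_pos, -row_id)

shuffled
```

Sorting on row_id rather than trial means the rows inside a block stay in whatever order they arrived in, even if trial were not already sorted.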