How to randomise order of group within group in R/dplyr?

Question

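I have data in which one group is nested within another. I want to randomise the order of the nested groups while preserving the order of the rows within each nested group. (This will be a step in an existing pipe, so a tidyverse solution would be ideal.)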

In the example below, how can I randomise the order of block within each participant_id, while preserving the order of participant_id and trial?

library(dplyr)
set.seed(123)

# dummy data
data <- tibble::tribble(
          ~participant_id, ~block, ~trial,
                       1L,    "a",     1L,
                       1L,    "a",     2L,
                       1L,    "a",     3L,
                       1L,    "b",     1L,
                       1L,    "b",     2L,
                       1L,    "b",     3L,
                       2L,    "a",     1L,
                       2L,    "a",     2L,
                       2L,    "a",     3L,
                       2L,    "b",     1L,
                       2L,    "b",     2L,
                       2L,    "b",     3L
          )


# something along the lines of...

new_data <- data %>% 
  group_by(participant_id) %>%
  # ? step here to randomise the order of 'block' within each participant, while preserving the order of 'trial'.

Thanks.

Answer 1

One option could be:

data %>%
    group_by(participant_id) %>%
    mutate(rleid = cumsum(block != lag(block, default = first(block))),
           block_random = sample(n())) %>%
    group_by(participant_id, rleid) %>%
    mutate(block_random = min(block_random)) %>%
    ungroup()

   participant_id block trial rleid block_random
            <int> <chr> <int> <int>        <int>
 1              1 a         1     0            2
 2              1 a         2     0            2
 3              1 a         3     0            2
 4              1 b         1     1            1
 5              1 b         2     1            1
 6              1 b         3     1            1
 7              2 a         1     0            2
 8              2 a         2     0            2
 9              2 a         3     0            2
10              2 b         1     1            1
11              2 b         2     1            1
12              2 b         3     1            1
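
Note that the output above still has the rows in their original order: block_random only tags each block with a random key. A minimal sketch of one way to finish the step, assuming the intended last move is to sort by that key (the arrange() and select() calls are an addition here, not part of the answer as posted, and the result name shuffled is made up for illustration):

```r
library(dplyr)
set.seed(123)

# Same dummy data as in the question, built with rep() for brevity
data <- tibble::tibble(
  participant_id = rep(1:2, each = 6),
  block          = rep(rep(c("a", "b"), each = 3), times = 2),
  trial          = rep(1:3, times = 4)
)

shuffled <- data %>%
  group_by(participant_id) %>%
  # rleid numbers each run of identical block values within a participant;
  # block_random starts out as a random permutation of the row positions
  mutate(rleid = cumsum(block != lag(block, default = first(block))),
         block_random = sample(n())) %>%
  group_by(participant_id, rleid) %>%
  # every row in a block takes the block's smallest random value as its key
  mutate(block_random = min(block_random)) %>%
  ungroup() %>%
  # assumed final step: order blocks by their key within each participant;
  # trial breaks ties so the within-block order stays as it was
  arrange(participant_id, block_random, trial) %>%
  select(-rleid, -block_random)
```

Because each block is keyed by the minimum of its rows' random values, the block order is uniformly random when all blocks have the same number of rows, as they do here. With unequal block sizes, larger blocks would be more likely to come first.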

Answer 2

Another one:

# Randomise within one participant
randomiseGroup <- function(.x, .y) {
  # Generalise so that any number of blocks can be handled
  r <- .x %>% 
    distinct(block) %>% 
    mutate(random=runif(nrow(.)))
  # Randomise
  .y %>% 
    bind_cols(
      .x %>% 
        ungroup() %>% 
        left_join(r, by="block") %>% 
        arrange(random, trial) %>% 
        select(-random)
    )
}

# Randomise all participants
data %>% 
  group_by(participant_id) %>% 
  group_map(randomiseGroup) %>% 
  bind_rows()
# A tibble: 12 × 3
   participant_id block trial
            <int> <chr> <int>
 1              1 a         1
 2              1 a         2
 3              1 a         3
 4              1 b         1
 5              1 b         2
 6              1 b         3
 7              2 b         1
 8              2 b         2
 9              2 b         3
10              2 a         1
11              2 a         2
12              2 a         3
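
The same idea, one random position per distinct block followed by a sort, can also be sketched without a helper function. This is an alternative for comparison rather than the answerer's code; the columns row and block_order are hypothetical helpers introduced here and dropped at the end, and the sketch assumes block is a character column.

```r
library(dplyr)
set.seed(123)

# Same dummy data as in the question, built with rep() for brevity
data <- tibble::tibble(
  participant_id = rep(1:2, each = 6),
  block          = rep(rep(c("a", "b"), each = 3), times = 2),
  trial          = rep(1:3, times = 4)
)

new_data <- data %>%
  # remember the original row order so it can be restored within each block
  mutate(row = row_number()) %>%
  group_by(participant_id) %>%
  # sample(unique(block)) is a random permutation of this participant's blocks;
  # match() gives each row the position of its block in that permutation
  mutate(block_order = match(block, sample(unique(block)))) %>%
  ungroup() %>%
  # sort blocks by their random position, keeping rows in original order inside a block
  arrange(participant_id, block_order, row) %>%
  select(-row, -block_order)
```

Since each participant draws its own permutation, the block order can differ between participants, as in the output above. Sorting on row rather than trial keeps whatever row order the input had inside each block, even if that is not the trial order.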
