Rearranging rows based on sequential unique values

Question

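I have the following data set with duplicated column names, and I would like to stack the repeated columns as shown below. I can get the desired output with bind_rows, but I would like to do it with tidyr functions: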

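library(dplyr)   # tibble(), %>%, select(), bind_rows()
library(tidyr)   # pivot_longer(), pivot_wider()

# .name_repair = "minimal" keeps the duplicated column names as they are
df <- tibble(
  runs = c(1, 2, 3, 4),
  col1 = c(3, 4, 5, 5),
  col2 = c(5, 3, 1, 4),
  col3 = c(6, 4, 9, 2),
  col1 = c(0, 2, 2, 1),
  col2 = c(2, 3, 1, 7),
  col3 = c(2, 4, 9, 9),
  col1 = c(3, 4, 5, 7),
  col2 = c(3, 3, 1, 4),
  col3 = c(3, 2, NA, NA),
  .name_repair = "minimal"
)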

df %>%
  select(runs, 2:4) %>%
  bind_rows(df %>%
              select(runs, 5:7)) %>%
  bind_rows(df %>%
              select(runs, 8:10))

# A tibble: 12 x 4            # Desired output: runs repeats 1 to 4 for each block of columns
    runs  col1  col2  col3
   <dbl> <dbl> <dbl> <dbl>
 1     1     3     5     6
 2     2     4     3     4
 3     3     5     1     9
 4     4     5     4     2
 5     1     0     2     2
 6     2     2     3     4
 7     3     2     1     9
 8     4     1     7     9
 9     1     3     3     3
10     2     4     3     2
11     3     5     1    NA
12     4     7     4    NA

However, when I use tidyr, the runs column comes out in a different order.

df %>%
  pivot_longer(-runs) %>%
  group_by(name) %>% 
  mutate(id = row_number()) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-id)

# A tibble: 12 x 4
    runs  col1  col2  col3
   <dbl> <dbl> <dbl> <dbl>
 1     1     3     5     6
 2     1     0     2     2
 3     1     3     3     3
 4     2     4     3     4
 5     2     2     3     4
 6     2     4     3     2
 7     3     5     1     9
 8     3     2     1     9
 9     3     5     1    NA
10     4     5     4     2
11     4     1     7     9
12     4     7     4    NA

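Could you tell me how to rearrange runs so that the numbers run sequentially (1, 2, 3, 4, 1, 2, ...) instead of three 1s in a row, then three 2s, and so on? Thank you very much.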

Answer 1

There may be a more elegant way to do this, but could you simply group by runs and use the row number to arrange the rows?

df %>%
  pivot_longer(cols = starts_with("col"),
               names_to = ".value") %>%
  group_by(runs) %>%
  mutate(grp_n = row_number()) %>%
  ungroup() %>%
  arrange(grp_n, runs)

# A tibble: 12 x 5
    runs  col1  col2  col3 grp_n
   <dbl> <dbl> <dbl> <dbl> <int>
 1     1     3     5     6     1
 2     2     4     3     4     1
 3     3     5     1     9     1
 4     4     5     4     2     1
 5     1     0     2     2     2
 6     2     2     3     4     2
 7     3     2     1     9     2
 8     4     1     7     9     2
 9     1     3     3     3     3
10     2     4     3     2     3
11     3     5     1    NA     3
12     4     7     4    NA     3
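
If the helper column is not wanted in the final result, it can be dropped at the end of the pipeline. The following is a minimal sketch of that variation, assuming dplyr and tidyr are installed; the name result is only used here for illustration, and the data is the df from the question.

```r
library(dplyr)
library(tidyr)

# Same data as in the question; "minimal" keeps the duplicated names
df <- tibble(
  runs = c(1, 2, 3, 4),
  col1 = c(3, 4, 5, 5), col2 = c(5, 3, 1, 4), col3 = c(6, 4, 9, 2),
  col1 = c(0, 2, 2, 1), col2 = c(2, 3, 1, 7), col3 = c(2, 4, 9, 9),
  col1 = c(3, 4, 5, 7), col2 = c(3, 3, 1, 4), col3 = c(3, 2, NA, NA),
  .name_repair = "minimal"
)

result <- df %>%
  # ".value" stacks same-named columns into one output column each
  pivot_longer(cols = starts_with("col"), names_to = ".value") %>%
  group_by(runs) %>%
  # number the stacked rows within each run: 1, 2, 3
  mutate(grp_n = row_number()) %>%
  ungroup() %>%
  # order by block first, then by run, so runs reads 1, 2, 3, 4, 1, 2, ...
  arrange(grp_n, runs) %>%
  # drop the helper column once the ordering is done
  select(-grp_n)
```

This should leave only runs, col1, col2 and col3, in the same row order as the bind_rows version in the question.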

Answer 2

A base R option using split.default:

data.frame(runs = df$runs,
           sapply(split.default(df[-1], names(df)[-1]), unlist),
           row.names = NULL)

#   runs col1 col2 col3
#1     1    3    5    6
#2     2    4    3    4
#3     3    5    1    9
#4     4    5    4    2
#5     1    0    2    2
#6     2    2    3    4
#7     3    2    1    9
#8     4    1    7    9
#9     1    3    3    3
#10    2    4    3    2
#11    3    5    1   NA
#12    4    7    4   NA
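
To see what the one-liner above is doing, here is a step-by-step sketch on a smaller toy data set. It uses a plain data.frame with check.names = FALSE instead of a tibble, so it runs in base R alone; the toy values and the names groups, stacked and result are only for illustration.

```r
# Toy data with duplicated column names (check.names = FALSE keeps them)
df <- data.frame(
  runs = 1:2,
  col1 = c(3, 4),
  col2 = c(5, 3),
  col1 = c(0, 2),
  col2 = c(2, 3),
  check.names = FALSE
)

# split.default() treats the data frame as a list of columns and
# groups those columns by the original (duplicated) names
groups <- split.default(df[-1], names(df)[-1])

# unlist() concatenates the columns of each group end to end, and
# sapply() binds the resulting vectors into a matrix, one column per name
stacked <- sapply(groups, unlist)

# runs has 2 values and stacked has 4 rows, so data.frame() recycles runs;
# row.names = NULL discards the row names produced by unlist()
result <- data.frame(runs = df$runs, stacked, row.names = NULL)
```

The key point is that the grouping uses names(df)[-1], so every column that shares a name ends up in the same group, in left-to-right order.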
