pivot_wider without removing duplicates

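I want to use pivot_wider so that the number of resulting columns equals the number of pivoted rows, which means keeping duplicate values in separate columns instead of collapsing them.

My example dataset: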

data <- data.frame(Person = c("Peter", "Peter", "Peter", "Peter", "Peter", "Peter",
                              "Carol", "Carol", "Carol", "Carol", "Carol", "Carol"),
                  GroupID = c(1, 1, 2, 2, 3, 3, 1, 1, 4, 4, 5, 5),
                  GroupTheme = c(1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2),
                  Committee = c("Transport", "State", "Transport", "State", "Transport", "State",
                                "Technology", "Nature", "Technology", "Nature", "Technology", "Nature"))

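I would like one row per person. To get there, I need to widen the dataset by GroupID and GroupTheme. Note that each person's Committee observations are repeated for every group. Every "name" in the original dataset is structured this way.

The code I have used so far: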

library(dplyr)
library(tidyr)
library(purrr)

widened = function(col, pre){
  data %>%
    select(Person, {{col}}) %>% 
    distinct() %>%
    with_groups(Person, ~mutate(.x, n = row_number())) %>% 
    pivot_wider(names_from = n, values_from = {{col}}, names_prefix = pre)
}

data <- reduce(list(widened(GroupID, "GroupID_"),
            widened(GroupTheme, "GroupTheme_"),
            widened(Committee, "Committee_")), 
       left_join, by = "Person")

This produces the following dataset:

Person GroupID_1 GroupID_2 GroupID_3 GroupTheme_1 GroupTheme_2 Committee_1 Committee_2
  <chr>      <dbl>     <dbl>     <dbl>        <dbl>        <dbl> <chr>       <chr>      
1 Peter          1         2         3            1            2 Transport   State      
2 Carol          1         4         5            1            2 Technology  Nature 

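As you can see, there are 3 GroupID_ columns but only 2 GroupTheme_ columns. This is because distinct() is applied to each column separately, and no person has more than 2 unique GroupTheme values.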
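The mismatch can be confirmed by counting unique values per person. This is only a small sketch using dplyr's n_distinct(); the column names n_ids and n_themes are made up for illustration.

```r
library(dplyr)

# Same example data as above
data <- data.frame(Person = rep(c("Peter", "Carol"), each = 6),
                   GroupID = c(1, 1, 2, 2, 3, 3, 1, 1, 4, 4, 5, 5),
                   GroupTheme = c(1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2))

# Count unique GroupID and GroupTheme values for each person
counts <- data %>%
  group_by(Person) %>%
  summarise(n_ids = n_distinct(GroupID),       # hypothetical column name
            n_themes = n_distinct(GroupTheme)) # hypothetical column name

print(counts)
```

Each person has three unique GroupID values but only two unique GroupTheme values, which is why de-duplicating the columns independently loses the pairing.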

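However, I want to be able to match each GroupID_ column with its corresponding GroupTheme_ column, so GroupTheme_1 should correspond to GroupID_1, and so on. The dataset should look like this: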

  Person GroupID_1 GroupID_2 GroupID_3 GroupTheme_1 GroupTheme_2 GroupTheme_3 Committee_1 Committee_2
1  Peter         1         2         3            1            1            2   Transport       State
2  Carol         1         4         5            1            2            2  Technology      Nature

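As far as I can tell, this requires not removing duplicate values across the GroupTheme_ columns. That would let me match each GroupID_ with each GroupTheme_ by number, just as they are paired in the original long dataset.

I have tried the options of pivot_wider but could not work out a way to do this.

If you have another (perhaps more direct) way to keep each ID matched with its theme after pivoting wider, that would also be much appreciated.

Thanks in advance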

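library(dplyr)
library(tidyr)

data %>%
  group_by(Person) %>%
  # Number each person's committees in order of first appearance (1, 2, ...)
  mutate(name = as.integer(factor(Committee, unique(Committee)))) %>%
  # Spread committees into columns: one row per Person/GroupID/GroupTheme
  pivot_wider(id_cols = c(Person, GroupID, GroupTheme), names_from = name,
              values_from = Committee, names_prefix = 'Committee_') %>%
  # Number each person's groups so GroupID_k and GroupTheme_k stay paired
  mutate(name = row_number()) %>%
  pivot_wider(id_cols = c(Person, starts_with('Committee')), names_from = name,
              values_from = c(GroupID, GroupTheme))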


# A tibble: 2 x 9
# Groups:   Person [2]
  Person Committee_1 Committee_2 GroupID_1 GroupID_2 GroupID_3 GroupTheme_1 GroupTheme_2 GroupTheme_3
  <chr>  <chr>       <chr>           <dbl>     <dbl>     <dbl>        <dbl>        <dbl>        <dbl>
1 Peter  Transport   State               1         2         3            1            1            2
2 Carol  Technology  Nature              1         4         5            1            2            2
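
Another way to get the same pairing, closer to the widened() approach in the question, might be to de-duplicate GroupID and GroupTheme together rather than one column at a time. The following is only a sketch under that assumption; the names groups_wide, committees_wide and result are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
data <- data.frame(Person = rep(c("Peter", "Carol"), each = 6),
                   GroupID = c(1, 1, 2, 2, 3, 3, 1, 1, 4, 4, 5, 5),
                   GroupTheme = c(1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2),
                   Committee = c(rep(c("Transport", "State"), 3),
                                 rep(c("Technology", "Nature"), 3)))

# De-duplicate on the (GroupID, GroupTheme) pair so each theme stays
# attached to its group, then number the pairs within each person
groups_wide <- data %>%
  distinct(Person, GroupID, GroupTheme) %>%
  group_by(Person) %>%
  mutate(n = row_number()) %>%
  ungroup() %>%
  pivot_wider(id_cols = Person, names_from = n,
              values_from = c(GroupID, GroupTheme))

# Committees are handled separately, as in the question's widened()
committees_wide <- data %>%
  distinct(Person, Committee) %>%
  group_by(Person) %>%
  mutate(n = row_number()) %>%
  ungroup() %>%
  pivot_wider(id_cols = Person, names_from = n,
              values_from = Committee, names_prefix = "Committee_")

result <- left_join(groups_wide, committees_wide, by = "Person")
print(result)
```

Because distinct() sees both columns at once, a repeated GroupTheme is kept whenever its GroupID differs, so GroupTheme_k always describes GroupID_k.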