How can I reshape multiple columns of long data to wide?

structure(tibble(c("top", "jng", "mid", "bot", "sup"), c("369", "Karsa", "knight", "JackeyLove", "yuyanjia"), 
           c("Malphite", "Rek'Sai",  "Zoe", "Aphelios", "Braum"), c("1", "1", "1", "1", "1"), c("7", "5", "7", "5", "0"), 
           c("6079-7578", "6079-7578", "6079-7578", "6079-7578", "6079-7578")), .Names = c("position", "player", "champion", "result", "kills", "gameid"))

Output:

# A tibble: 5 x 6
  position player     champion result kills gameid   
* <chr>    <chr>      <chr>    <chr>  <chr> <chr>    
1 top      369        Malphite 1      7     6079-7578
2 jng      Karsa      Rek'Sai  1      5     6079-7578
3 mid      knight     Zoe      1      7     6079-7578
4 bot      JackeyLove Aphelios 1      5     6079-7578
5 sup      yuyanjia   Braum    1      0     6079-7578

My desired output is:

structure(list(gameid = "6079-7578", result = "1", player_top = "369", 
    player_jng = "Karsa", player_mid = "knight", player_bot = "JackeyLove", 
    player_sup = "yuyanjia", champion_top = "Malphite", champion_jng = "Rek'Sai", 
    champion_mid = "Zoe", champion_bot = "Aphelios", champion_sup = "Braum", 
    kills_top = "7", kills_jng = "5", kills_mid = "7", kills_bot = "5", 
    kills_sup = "0"), row.names = c(NA, -1L), class = c("tbl_df", 
"tbl", "data.frame")) 

which looks like this:

     gameid result player_top player_jng player_mid player_bot player_sup champion_top champion_jng champion_mid champion_bot champion_sup
1 6079-7578      1        369      Karsa     knight JackeyLove   yuyanjia     Malphite      Rek'Sai          Zoe     Aphelios        Braum
  kills_top kills_jng kills_mid kills_bot kills_sup
1         7         5         7         5         0

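I know I should use something like pivot_wider() and drop_na(), but I don't know how to apply pivot_wider() to multiple columns and collapse the rows at the same time. Any help would be appreciated.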

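Using data.table might help. In dcast(), each row is identified by a unique combination of gameid and result, the columns are spread by position, and the cells are filled with the values of the variables listed in value.var.
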
library(data.table)
library(dplyr)
df <- structure(tibble(c("top", "jng", "mid", "bot", "sup"), c("369", "Karsa", "knight", "JackeyLove", "yuyanjia"), 
                 c("Malphite", "Rek'Sai",  "Zoe", "Aphelios", "Braum"), c("1", "1", "1", "1", "1"), c("7", "5", "7", "5", "0"), 
                 c("6079-7578", "6079-7578", "6079-7578", "6079-7578", "6079-7578")), .Names = c("position", "player", "champion", "result", "kills", "gameid"))

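# setDT() converts df to a data.table by reference; a character vector in
# value.var spreads all three columns at once
df2 <- dcast(setDT(df), gameid + result ~ position, value.var = c("player", "champion", "kills"))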
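
One thing to watch with the dcast() approach: as far as I know, the new columns follow the sorted values of position, so with a character column they come out alphabetically (bot, jng, mid, ...) rather than in the order from the question. A minimal sketch of one way around that, assuming the order of appearance in the data is the order you want, is to turn position into a factor first. The names dt and wide are just placeholders for this example.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
# (length-1 values are recycled to all five rows)
dt <- data.table(
  position = c("top", "jng", "mid", "bot", "sup"),
  player   = c("369", "Karsa", "knight", "JackeyLove", "yuyanjia"),
  champion = c("Malphite", "Rek'Sai", "Zoe", "Aphelios", "Braum"),
  result   = "1",
  kills    = c("7", "5", "7", "5", "0"),
  gameid   = "6079-7578"
)

# Assumption: order of appearance is the desired column order
dt[, position := factor(position, levels = unique(position))]

# One row per gameid/result; columns spread by the levels of position
wide <- dcast(dt, gameid + result ~ position,
              value.var = c("player", "champion", "kills"))
names(wide)
```

If you don't care about the column order, you can skip the factor() step.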

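You can use pivot_wider() for this. Pass the "position" variable to names_from so that the new column names come from it, and pass the three variables whose values should fill those columns to values_from.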

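By default, when there are multiple values_from variables, their names are pasted onto the front of the new column names. This can be changed, but in this case it matches your desired naming structure.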

All other variables in the original dataset are used as id_cols, in the order they appear.

library(tidyr)
# dat holds the data from the question
pivot_wider(dat,  
            names_from = "position", 
            values_from = c("player", "champion", "kills"))
#>   result    gameid player_top player_jng player_mid player_bot player_sup
#> 1      1 6079-7578        369      Karsa     knight JackeyLove   yuyanjia
#>   champion_top champion_jng champion_mid champion_bot champion_sup kills_top
#> 1     Malphite      Rek'Sai          Zoe     Aphelios        Braum         7
#>   kills_jng kills_mid kills_bot kills_sup
#> 1         5         7         5         0
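
If you did want a different naming scheme than the default mentioned above, one option is the names_glue argument of pivot_wider(), which takes a glue specification built from the names_from column and the special .value placeholder. Here is a minimal sketch that puts the position first; the data is rebuilt from the question so the example runs on its own, and the "{position}_{.value}" pattern is only an illustration, not something you need for your desired output.

```r
library(tidyr)
library(tibble)

# Data from the question (length-1 values are recycled)
dat <- tibble(
  position = c("top", "jng", "mid", "bot", "sup"),
  player   = c("369", "Karsa", "knight", "JackeyLove", "yuyanjia"),
  champion = c("Malphite", "Rek'Sai", "Zoe", "Aphelios", "Braum"),
  result   = "1",
  kills    = c("7", "5", "7", "5", "0"),
  gameid   = "6079-7578"
)

# .value stands for each of the values_from column names in turn,
# so this yields names like top_player instead of player_top
wide <- pivot_wider(dat,
                    id_cols = c("gameid", "result"),
                    names_from = "position",
                    values_from = c("player", "champion", "kills"),
                    names_glue = "{position}_{.value}")
names(wide)
```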

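You can control the order of the id columns in the output by writing them out explicitly in id_cols. Here is an example that matches your desired output.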

pivot_wider(dat, id_cols = c("gameid", "result"), 
            names_from = "position", 
            values_from = c("player", "champion", "kills"))
#>      gameid result player_top player_jng player_mid player_bot player_sup
#> 1 6079-7578      1        369      Karsa     knight JackeyLove   yuyanjia
#>   champion_top champion_jng champion_mid champion_bot champion_sup kills_top
#> 1     Malphite      Rek'Sai          Zoe     Aphelios        Braum         7
#>   kills_jng kills_mid kills_bot kills_sup
#> 1         5         7         5         0

Created on 2021-06-24 by the reprex package (v2.0.0)