Transposing a data frame by groups in R with missing values

I have a data frame that looks like this:

Country    Variable      2012     2013    2014
Germany    Medical       11       2       4
Germany    Transport     12       6       8
France     Medical       15       10      12
France     Transport     17       13      14  
France     Food          24       14      15

I want to transpose the data frame so that the final data frame takes the following form:

Country     year    Medical    Transport     Food 
Germany     2012    11         12            NA
Germany     2013    2          6             NA
Germany     2014    4          8             NA
France      2012    15         17            24
France      2013    10         13            14  
France      2014    12         14            15

I have tried several functions, including melt, reshape, and spread, but none of them worked. Does anybody have any ideas?

You can reshape the data to long format first and then back to wide format:

library(tidyr)

df1 %>%
  pivot_longer(cols = -c(Country, Variable), names_to = "year") %>%
  pivot_wider(names_from = Variable, values_from = value)

# A tibble: 6 x 5
#  Country year  Medical Transport  Food
#  <chr>   <chr>   <int>     <int> <int>
#1 Germany 2012       11        12    NA
#2 Germany 2013        2         6    NA
#3 Germany 2014        4         8    NA
#4 France  2012       15        17    24
#5 France  2013       10        13    14
#6 France  2014       12        14    15
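
To see where the NA values for Germany come from, it can help to run the two steps separately and look at the intermediate long table. The following is only a sketch: it assumes the sample data from the Data section at the end and a tidyr version that provides pivot_longer/pivot_wider (1.0.0 or later); the object names long and wide are purely illustrative.

```r
library(tidyr)

# Sample data, same values as df1 in the Data section
df1 <- data.frame(
  Country  = c("Germany", "Germany", "France", "France", "France"),
  Variable = c("Medical", "Transport", "Medical", "Transport", "Food"),
  `2012` = c(11L, 12L, 15L, 17L, 24L),
  `2013` = c(2L, 6L, 10L, 13L, 14L),
  `2014` = c(4L, 8L, 12L, 14L, 15L),
  check.names = FALSE,       # keep "2012" etc. as column names
  stringsAsFactors = FALSE
)

# Step 1: one row per Country/Variable/year combination
# (5 input rows x 3 year columns = 15 rows)
long <- pivot_longer(df1, cols = -c(Country, Variable), names_to = "year")

# Step 2: spread Variable back out into columns.
# Germany has no "Food" rows in `long`, so those cells become NA.
wide <- pivot_wider(long, names_from = Variable, values_from = value)
```

Note that year is a character column here because it comes from the column names; as.integer(wide$year) converts it if a numeric year is needed.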

For older versions of tidyr, use gather and spread:

df1 %>%
  gather(year, value, -c(Country, Variable)) %>%
  spread(Variable, value)

We can also use transpose from data.table:

library(data.table) # v >= 1.12.4 
rbindlist(lapply(split(df1[-1], df1$Country), function(x) 
   data.table::transpose(x, keep.names = 'year', make.names = "Variable")), 
      idcol = 'Country', fill = TRUE)
#   Country year Medical Transport Food
#1:  France 2012      15        17   24
#2:  France 2013      10        13   14
#3:  France 2014      12        14   15
#4: Germany 2012      11        12   NA
#5: Germany 2013       2         6   NA
#6: Germany 2014       4         8   NA
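
The role of fill = TRUE above is easier to see by transposing a single group on its own. This is a minimal sketch, assuming the same sample data and data.table 1.12.4 or later (for the keep.names and make.names arguments); the name germany is only for illustration.

```r
library(data.table)

# Sample data, same values as df1 in the Data section
df1 <- data.frame(
  Country  = c("Germany", "Germany", "France", "France", "France"),
  Variable = c("Medical", "Transport", "Medical", "Transport", "Food"),
  `2012` = c(11L, 12L, 15L, 17L, 24L),
  `2013` = c(2L, 6L, 10L, 13L, 14L),
  `2014` = c(4L, 8L, 12L, 14L, 15L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Rows for one country, without the Country column
x <- df1[df1$Country == "Germany", -1]

# Transpose: the old column names (2012, 2013, 2014) go into a "year" column,
# and the values of Variable become the new column names
germany <- data.table::transpose(x, keep.names = "year", make.names = "Variable")

# Germany has no "Food" row, so the result has no Food column
names(germany)
```

Because the per-country pieces can have different sets of columns, rbindlist needs fill = TRUE to pad the missing ones with NA when stacking them.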

Data

df1 <- structure(list(Country = c("Germany", "Germany", "France", "France", 
"France"), Variable = c("Medical", "Transport", "Medical", "Transport", 
"Food"), `2012` = c(11L, 12L, 15L, 17L, 24L), `2013` = c(2L, 
6L, 10L, 13L, 14L), `2014` = c(4L, 8L, 12L, 14L, 15L)), 
 class = "data.frame", row.names = c(NA, 
-5L))