Transposing a data-frame by groups in R with missing values
I have a data frame that looks like this:
Country Variable 2012 2013 2014
Germany Medical 11 2 4
Germany Transport 12 6 8
France Medical 15 10 12
France Transport 17 13 14
France Food 24 14 15
I want to transpose the data frame so that the final data frame takes the following form:
Country year Medical Transport Food
Germany 2012 11 12 NA
Germany 2013 2 6 NA
Germany 2014 4 8 NA
France 2012 15 17 24
France 2013 10 13 14
France 2014 12 14 15
I have tried several functions, including melt, reshape, and spread, but none of them worked. Does anyone have any ideas?
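You can first convert the data to long format and then back to wide format. Germany has no Food row, so pivot_wider fills those cells with NA.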
library(tidyr)
# df1 is the data frame defined under "Data" below
df1 %>%
  pivot_longer(cols = -c(Country, Variable), names_to = "year") %>%
  pivot_wider(names_from = Variable, values_from = value)
# A tibble: 6 x 5
# Country year Medical Transport Food
# <chr> <chr> <int> <int> <int>
#1 Germany 2012 11 12 NA
#2 Germany 2013 2 6 NA
#3 Germany 2014 4 8 NA
#4 France 2012 15 17 24
#5 France 2013 10 13 14
#6 France 2014 12 14 15
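The two-step pipeline above can also be run one step at a time, which makes it easier to see where the NA values come from. The sketch below is only an illustration: it rebuilds the sample data with data.frame() instead of structure(), and it assumes tidyr 1.1.0 or later so that names_transform is available to store year as an integer rather than a character column. The object names long and wide are my own.

```r
library(tidyr)

# Same sample data as in the "Data" section, rebuilt with data.frame()
df1 <- data.frame(
  Country  = c("Germany", "Germany", "France", "France", "France"),
  Variable = c("Medical", "Transport", "Medical", "Transport", "Food"),
  `2012`   = c(11L, 12L, 15L, 17L, 24L),
  `2013`   = c(2L, 6L, 10L, 13L, 14L),
  `2014`   = c(4L, 8L, 12L, 14L, 15L),
  check.names = FALSE,       # keep "2012" etc. as column names
  stringsAsFactors = FALSE
)

# Step 1: long format, one row per Country / Variable / year
long <- pivot_longer(
  df1,
  cols = -c(Country, Variable),
  names_to = "year",
  names_transform = list(year = as.integer)  # assumption: tidyr >= 1.1.0
)

# Step 2: wide format, one column per Variable.
# Germany has no "Food" rows in `long`, so those cells become NA.
wide <- pivot_wider(long, names_from = Variable, values_from = value)
```

Inspecting long before the second step shows that it has 15 rows (5 original rows times 3 year columns); the missing Germany/Food combination only turns into NA once the data is widened again.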
For older versions of tidyr, use gather and spread:
df1 %>%
  gather(year, value, -c(Country, Variable)) %>%
  spread(Variable, value)
We can also use transpose from data.table, applied to each country separately:
library(data.table) # v >= 1.12.4
# split by Country, transpose each piece, then bind the pieces back together
rbindlist(
  lapply(split(df1[-1], df1$Country), function(x)
    data.table::transpose(x, keep.names = 'year', make.names = "Variable")),
  idcol = 'Country', fill = TRUE)
# Country year Medical Transport Food
#1: France 2012 15 17 24
#2: France 2013 10 13 14
#3: France 2014 12 14 15
#4: Germany 2012 11 12 NA
#5: Germany 2013 2 6 NA
#6: Germany 2014 4 8 NA
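Since the question mentions melt, here is a sketch of a different data.table route that uses melt and dcast instead of transpose. It is an alternative, not the method shown above. The sample data is rebuilt with data.frame(), and the object names dt, long, and wide are my own.

```r
library(data.table)

# Same sample data as in the "Data" section, rebuilt with data.frame()
df1 <- data.frame(
  Country  = c("Germany", "Germany", "France", "France", "France"),
  Variable = c("Medical", "Transport", "Medical", "Transport", "Food"),
  `2012`   = c(11L, 12L, 15L, 17L, 24L),
  `2013`   = c(2L, 6L, 10L, 13L, 14L),
  `2014`   = c(4L, 8L, 12L, 14L, 15L),
  check.names = FALSE,       # keep "2012" etc. as column names
  stringsAsFactors = FALSE
)

# Work on a data.table copy so df1 itself is left unchanged
dt <- as.data.table(df1)

# Long format: the year columns are stacked into "year" / "value"
long <- melt(dt, id.vars = c("Country", "Variable"),
             variable.name = "year", value.name = "value")

# Wide format: one row per Country/year, one column per Variable.
# Combinations absent from `long` (Germany/Food) are filled with NA.
wide <- dcast(long, Country + year ~ Variable, value.var = "value")
```

Note that melt stores year as a factor by default, and dcast orders the new columns alphabetically (Food, Medical, Transport), so the column order differs from the transpose result.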
Data
df1 <- structure(list(Country = c("Germany", "Germany", "France", "France",
"France"), Variable = c("Medical", "Transport", "Medical", "Transport",
"Food"), `2012` = c(11L, 12L, 15L, 17L, 24L), `2013` = c(2L,
6L, 10L, 13L, 14L), `2014` = c(4L, 8L, 12L, 14L, 15L)),
class = "data.frame", row.names = c(NA,
-5L))