在 R 中重塑大型数据集

Question

我正在尝试重塑大型数据集，但无法按我想要的正确顺序获得结果。

数据如下：

GeoFIPS GeoName IndustryID  Description X2001   X2002   X2003   X2004   X2005 
10180   Abilene, TX     21  Mining      96002   92407   127138 150449   202926
10180   Abilene, TX     22  Utilities   33588   34116   33105   33265   32452
...

数据框很长，包括美国所有 MSA 和选定的行业。

我希望它看起来像这样：

GeoFIPS GeoName        Year     Mining Utilities (etc)
10180   Abilene, TX    2001     96002   33588
10180   Abilene, TX    2002     92407   34116
....

我是 R 的新手，非常感谢您的帮助。我检查了从宽到长和从长到宽，但这似乎是一个更复杂的情况。谢谢！

编辑：数据

df1 <- structure(list(GeoFIPS = c(10180L, 10180L), GeoName =
c("Abilene, TX", 
"Abilene, TX"), IndustryID = 21:22, Description = c("Mining", 
"Utilities"), X2001 = c(96002L, 33588L), X2002 = c(92407L, 34116L
), X2003 = c(127138L, 33105L), X2004 = c(150449L, 33265L), X2005 = 
c(202926L, 
 32452L)), .Names = c("GeoFIPS", "GeoName", "IndustryID", "Description", 
"X2001", "X2002", "X2003", "X2004", "X2005"), class = "data.frame",
 row.names = c(NA, -2L))

Answer 1

您可以使用 reshape2

中的 melt/dcast

library(reshape2)
df2 <- melt(df1, id.var=c('GeoFIPS', 'GeoName', 
               'IndustryID', 'Description'))
df2 <- transform(df2, Year=sub('^X', '', variable))[-c(3,5)]


dcast(df2, ...~Description, value.var='value')
#  GeoFIPS     GeoName Year Mining Utilities
#1   10180 Abilene, TX 2001  96002     33588
#2   10180 Abilene, TX 2002  92407     34116
#3   10180 Abilene, TX 2003 127138     33105
#4   10180 Abilene, TX 2004 150449     33265
#5   10180 Abilene, TX 2005 202926     32452

数据

df1 <- structure(list(GeoFIPS = c(10180L, 10180L), GeoName =
c("Abilene, TX", 
"Abilene, TX"), IndustryID = 21:22, Description = c("Mining", 
"Utilities"), X2001 = c(96002L, 33588L), X2002 = c(92407L, 34116L
), X2003 = c(127138L, 33105L), X2004 = c(150449L, 33265L), X2005 = 
c(202926L, 
 32452L)), .Names = c("GeoFIPS", "GeoName", "IndustryID", "Description", 
"X2001", "X2002", "X2003", "X2004", "X2005"), class = "data.frame",
 row.names = c(NA, -2L))

在 R 中重塑大型数据集

Reshaping large dataset in R

r

reshape

reshape2

数据