R - How to reshape a data frame, collating the values of two columns into one?
I have a data frame that I need to reshape so it is easier to use in a visualization application. Here is a condensed version of the data frame:
Carrier <- c("Mesa", "United", "JetBlue", "ExpressJet", "SkyWest")
Flight_Num <- c(7124, 7177, 334, 1223, 6380)
Origin <- c("ORD", "EWR", "SFO", "BOS", "BDL")
Dest <- c("PIT", "BOI", "DSM", "CWA", "CMH")
Sched_Depr <- c(1955, 1900, 1845, 1253, 1755)
df <- data.frame(Carrier, Flight_Num, Origin, Dest, Sched_Depr)
     Carrier Flight_Num Origin Dest Sched_Depr
1       Mesa       7124    ORD  PIT       1955
2     United       7177    EWR  BOI       1900
3    JetBlue        334    SFO  DSM       1845
4 ExpressJet       1223    BOS  CWA       1253
5    SkyWest       6380    BDL  CMH       1755
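Origin and Dest are interpreted by the visualization application as geographic data (i.e., coordinates). I need to collate them into a single column called Coords. I also need to create a new variable, Order_Points, that marks the order of the points. The new, reshaped data frame would look like this: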
      Carrier Flight_Num Coords Sched_Depr Order_Points
1        Mesa       7124    ORD       1955            1
2        Mesa       7124    PIT       1955            2
3      United       7177    EWR       1900            1
4      United       7177    BOI       1900            2
5     JetBlue        334    SFO       1845            1
6     JetBlue        334    DSM       1845            2
7  ExpressJet       1223    BOS       1253            1
8  ExpressJet       1223    CWA       1253            2
9     SkyWest       6380    BDL       1755            1
10    SkyWest       6380    CMH       1755            2
What is an efficient way to collate two columns like this while retaining (and duplicating) the other variables?
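Here is an option using tidyverse functions. We use gather to convert the data frame from "wide" format to "long" format. This also adds a column (called Type here) that marks whether each value in Coords is an Origin or a Dest.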
library(tidyverse)
df.long = df %>%
gather(Type, Coords, Origin, Dest) %>%
arrange(Carrier, desc(Type))
      Carrier Flight_Num Sched_Depr   Type Coords
1  ExpressJet       1223       1253 Origin    BOS
2  ExpressJet       1223       1253   Dest    CWA
3     JetBlue        334       1845 Origin    SFO
4     JetBlue        334       1845   Dest    DSM
5        Mesa       7124       1955 Origin    ORD
6        Mesa       7124       1955   Dest    PIT
7     SkyWest       6380       1755 Origin    BDL
8     SkyWest       6380       1755   Dest    CMH
9      United       7177       1900 Origin    EWR
10     United       7177       1900   Dest    BOI
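The output above has a Type column rather than the Order_Points column the question asks for. As a rough sketch only, the same reshape could be written with pivot_longer, which tidyr (version 1.0.0 or later) offers as the newer replacement for gather. The name df_long and the mapping of Origin to 1 and Dest to 2 are assumptions made for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
df <- data.frame(
  Carrier    = c("Mesa", "United", "JetBlue", "ExpressJet", "SkyWest"),
  Flight_Num = c(7124, 7177, 334, 1223, 6380),
  Origin     = c("ORD", "EWR", "SFO", "BOS", "BDL"),
  Dest       = c("PIT", "BOI", "DSM", "CWA", "CMH"),
  Sched_Depr = c(1955, 1900, 1845, 1253, 1755)
)

df_long <- df %>%
  # Stack Origin and Dest into Coords; Type records the source column
  pivot_longer(cols = c(Origin, Dest), names_to = "Type", values_to = "Coords") %>%
  # Assumed mapping: Origin is point 1, Dest is point 2
  mutate(Order_Points = if_else(Type == "Origin", 1L, 2L)) %>%
  # Keep only the columns shown in the desired output, in that order
  select(Carrier, Flight_Num, Coords, Sched_Depr, Order_Points)

print(df_long)
```

Because pivot_longer keeps each flight's two rows together, no arrange step is needed to pair an origin with its destination.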
You can also use base R:
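dat <- data.frame(Carrier, Flight_Num, Origin, Dest, Sched_Depr)
# Stack columns 3:4 (Origin, Dest) into one column; "time" records which one (1 or 2)
df <- reshape(dat, idvar = "Carrier", varying = list(3:4), direction = "long")
# Sort by Carrier and reset the row names
`row.names<-`(df[order(df[, 1]), ], NULL)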
      Carrier Flight_Num Sched_Depr time Origin
1  ExpressJet       1223       1253    1    BOS
2  ExpressJet       1223       1253    2    CWA
3     JetBlue        334       1845    1    SFO
4     JetBlue        334       1845    2    DSM
5        Mesa       7124       1955    1    ORD
6        Mesa       7124       1955    2    PIT
7     SkyWest       6380       1755    1    BDL
8     SkyWest       6380       1755    2    CMH
9      United       7177       1900    1    EWR
10     United       7177       1900    2    BOI
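You can rename the time variable in the example above to whatever you prefer. The stacked values keep the column name Origin (the first of the two reshaped columns), so you may want to rename that as well.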
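One way to do that renaming, as a sketch, is to pass the v.names and timevar arguments to reshape so the columns come out as Coords and Order_Points directly. The object name long and the re-sorting by the original carrier order are assumptions for illustration. The call also assumes Carrier is unique per row, as it is in the sample data; with real data a combined id such as c("Carrier", "Flight_Num") may be needed.

```r
# Same sample data as in the question
dat <- data.frame(
  Carrier    = c("Mesa", "United", "JetBlue", "ExpressJet", "SkyWest"),
  Flight_Num = c(7124, 7177, 334, 1223, 6380),
  Origin     = c("ORD", "EWR", "SFO", "BOS", "BDL"),
  Dest       = c("PIT", "BOI", "DSM", "CWA", "CMH"),
  Sched_Depr = c(1955, 1900, 1845, 1253, 1755)
)

long <- reshape(
  dat,
  direction = "long",
  idvar     = "Carrier",                  # assumes one row per Carrier
  varying   = list(c("Origin", "Dest")),  # the two columns to stack
  v.names   = "Coords",                   # name of the stacked value column
  timevar   = "Order_Points"              # 1 = Origin, 2 = Dest
)

# Restore the original carrier order, with Origin before Dest for each flight
long <- long[order(match(long$Carrier, dat$Carrier), long$Order_Points), ]
row.names(long) <- NULL

# Arrange the columns as in the desired output
long <- long[, c("Carrier", "Flight_Num", "Coords", "Sched_Depr", "Order_Points")]
print(long)
```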