根据R中的日期将行转换为列
Converting rows to columns based on Dates in R
假设我有一个如下所示的数据框:
car <- c(rep("Acura", 2),rep("Benz", 2), rep("Audi", 2), rep("Toyota", 2), rep("Honda", 2))
date <- rep(c(as.Date("01/01/21"), as.Date("30/01/21")), 5)
mileage <- c(1200, 2400, 1000, 2000, 34000, 34500, 54, 1200, 4000,6000)
avg_speed <- c(24,54,100, 72,65,56,98,NA,55,50)
ex <- data.frame(car, date, mileage, avg_speed)
所以 ex 看起来像:
> ex
car date mileage avg_speed
1 Acura 1-01-21 1200 24
2 Acura 30-01-21 2400 54
3 Benz 1-01-21 1000 100
4 Benz 30-01-21 2000 72
5 Audi 1-01-21 34000 65
6 Audi 30-01-21 34500 56
7 Toyota 1-01-21 54 98
8 Toyota 30-01-21 1200 NA
9 Honda 1-01-21 4000 55
10 Honda 30-01-21 6000 50
我希望我的数据框看起来像这样:
car mileage_1_01_21 mileage_30_01_21 avg_speed_1_01_21 avg_speed_30_01_21
1 Acura 1200 2400 24 54
2 Benz 1000 2000 100 72
3 Audi 34000 34500 65 56
4 Toyota 54 1200 98 NA
5 Honda 4000 6000 55 50
基本上我想按每个日期对列进行分层。
实现这种结构的最佳方式是什么?
您可以使用 data.table
包中的 dcast
:
library(data.table)
car <- c(rep("Acura", 2),rep("Benz", 2), rep("Audi", 2), rep("Toyota", 2), rep("Honda", 2))
date <- rep(c(as.Date("01/01/21"), as.Date("30/01/21")), 5)
mileage <- c(1200, 2400, 1000, 2000, 34000, 34500, 54, 1200, 4000,6000)
avg_speed <- c(24,54,100, 72,65,56,98,NA,55,50)
ex <- data.frame(car, date, mileage, avg_speed)
setDT(ex)
ex2 <- dcast(ex,
formula = car ~ date,
value.var = c("mileage", "avg_speed")
)
> ex2
car mileage_0001-01-21 mileage_0030-01-21 avg_speed_0001-01-21 avg_speed_0030-01-21
1: Acura 1200 2400 24 54
2: Audi 34000 34500 65 56
3: Benz 1000 2000 100 72
4: Honda 4000 6000 55 50
5: Toyota 54 1200 98 NA
像下面那样尝试 reshape
reshape(
ex,
direction = "wide",
idvar = "car",
timevar = "date"
)
给予
car mileage.0001-01-21 avg_speed.0001-01-21 mileage.0030-01-21
1 Acura 1200 24 2400
3 Benz 1000 100 2000
5 Audi 34000 65 34500
7 Toyota 54 98 1200
9 Honda 4000 55 6000
avg_speed.0030-01-21
1 54
3 72
5 56
7 NA
9 50
假设我有一个如下所示的数据框:
car <- c(rep("Acura", 2),rep("Benz", 2), rep("Audi", 2), rep("Toyota", 2), rep("Honda", 2))
date <- rep(c(as.Date("01/01/21"), as.Date("30/01/21")), 5)
mileage <- c(1200, 2400, 1000, 2000, 34000, 34500, 54, 1200, 4000,6000)
avg_speed <- c(24,54,100, 72,65,56,98,NA,55,50)
ex <- data.frame(car, date, mileage, avg_speed)
所以 ex 看起来像:
> ex
car date mileage avg_speed
1 Acura 1-01-21 1200 24
2 Acura 30-01-21 2400 54
3 Benz 1-01-21 1000 100
4 Benz 30-01-21 2000 72
5 Audi 1-01-21 34000 65
6 Audi 30-01-21 34500 56
7 Toyota 1-01-21 54 98
8 Toyota 30-01-21 1200 NA
9 Honda 1-01-21 4000 55
10 Honda 30-01-21 6000 50
我希望我的数据框看起来像这样:
car mileage_1_01_21 mileage_30_01_21 avg_speed_1_01_21 avg_speed_30_01_21
1 Acura 1200 2400 24 54
2 Benz 1000 2000 100 72
3 Audi 34000 34500 65 56
4 Toyota 54 1200 98 NA
5 Honda 4000 6000 55 50
基本上我想按每个日期对列进行分层。 实现这种结构的最佳方式是什么?
您可以使用 data.table
包中的 dcast
:
library(data.table)
car <- c(rep("Acura", 2),rep("Benz", 2), rep("Audi", 2), rep("Toyota", 2), rep("Honda", 2))
date <- rep(c(as.Date("01/01/21"), as.Date("30/01/21")), 5)
mileage <- c(1200, 2400, 1000, 2000, 34000, 34500, 54, 1200, 4000,6000)
avg_speed <- c(24,54,100, 72,65,56,98,NA,55,50)
ex <- data.frame(car, date, mileage, avg_speed)
setDT(ex)
ex2 <- dcast(ex,
formula = car ~ date,
value.var = c("mileage", "avg_speed")
)
> ex2
car mileage_0001-01-21 mileage_0030-01-21 avg_speed_0001-01-21 avg_speed_0030-01-21
1: Acura 1200 2400 24 54
2: Audi 34000 34500 65 56
3: Benz 1000 2000 100 72
4: Honda 4000 6000 55 50
5: Toyota 54 1200 98 NA
像下面那样尝试 reshape
reshape(
ex,
direction = "wide",
idvar = "car",
timevar = "date"
)
给予
car mileage.0001-01-21 avg_speed.0001-01-21 mileage.0030-01-21
1 Acura 1200 24 2400
3 Benz 1000 100 2000
5 Audi 34000 65 34500
7 Toyota 54 98 1200
9 Honda 4000 55 6000
avg_speed.0030-01-21
1 54
3 72
5 56
7 NA
9 50