R dataset from long to wide - under a specific condition
I want to convert a long, chronologically ordered dataset into a wide one, by ID, while keeping the chronological order.
Let's look at an example:
ID | Product | Date |
---|---|---|
1 | Bike | 1/1/2000 |
1 | Tire | 2/1/2000 |
2 | Car | 15/2/2000 |
2 | Seat | 17/2/2000 |
1 | Chronometer | 20/2/2000 |
into the following table:
ID | 1st | 2nd | 3rd | etc |
---|---|---|---|---|
1 | Bike | Tire | Chronometer | |
2 | Car | Seat | | |
The order in which the products were purchased must not change.
Can you help me?
Thanks a lot!
`arrange` the data by `ID` and `Date`, give each row a unique number within its `ID`, and cast the data to wide format.
library(dplyr)

df %>%
  mutate(Date = as.Date(Date, '%d/%m/%Y')) %>%  # parse day/month/year strings
  arrange(ID, Date) %>%                          # chronological order within each ID
  group_by(ID) %>%
  mutate(row = row_number()) %>%                 # position of the purchase within its ID
  tidyr::pivot_wider(names_from = row, values_from = c(Product, Date))

#     ID Product_1 Product_2 Product_3   Date_1     Date_2     Date_3
#  <int> <chr>     <chr>     <chr>       <date>     <date>     <date>
#1     1 Bike      Tire      Chronometer 2000-01-01 2000-01-02 2000-02-20
#2     2 Car       Seat      NA          2000-02-15 2000-02-17 NA
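The table asked for in the question only has the products, not the dates. A minimal sketch of that variant follows, assuming the same example data: `Date` is used for sorting and then dropped before pivoting. The `Product_` prefix and the name `wide_products` are my own choices, not something from the question.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, rebuilt so the block runs on its own.
df <- data.frame(
  ID = c(1L, 1L, 2L, 2L, 1L),
  Product = c("Bike", "Tire", "Car", "Seat", "Chronometer"),
  Date = c("1/1/2000", "2/1/2000", "15/2/2000", "17/2/2000", "20/2/2000"),
  stringsAsFactors = FALSE
)

wide_products <- df %>%
  mutate(Date = as.Date(Date, format = "%d/%m/%Y")) %>%
  arrange(ID, Date) %>%              # chronological order within each ID
  group_by(ID) %>%
  mutate(row = row_number()) %>%     # 1, 2, 3, ... per ID
  ungroup() %>%
  select(-Date) %>%                  # Date was only needed for sorting
  pivot_wider(names_from = row, values_from = Product,
              names_prefix = "Product_")

wide_products
```

IDs with fewer purchases than the longest one are padded with `NA` on the right.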
data
df <- structure(list(ID = c(1L, 1L, 2L, 2L, 1L), Product = c("Bike",
"Tire", "Car", "Seat", "Chronometer"), Date = c("1/1/2000", "2/1/2000",
"15/2/2000", "17/2/2000", "20/2/2000")), class = "data.frame", row.names = c(NA, -5L))
A base R option using `reshape`
reshape(
  transform(
    df,
    q = ave(1:nrow(df), ID, FUN = seq_along)
  ),
  direction = "wide",
  idvar = "ID",
  timevar = "q"
)
gives
  ID Product.1    Date.1 Product.2    Date.2   Product.3    Date.3
1  1      Bike  1/1/2000      Tire  2/1/2000 Chronometer 20/2/2000
3  2       Car 15/2/2000      Seat 17/2/2000        <NA>      <NA>
If you don't want to keep `Date`, you can try this
reshape(
  transform(
    subset(df, select = -Date),
    q = ave(1:nrow(df), ID, FUN = seq_along)
  ),
  direction = "wide",
  idvar = "ID",
  timevar = "q"
)
which gives
  ID Product.1 Product.2   Product.3
1  1      Bike      Tire Chronometer
3  2       Car      Seat        <NA>
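Note that the `reshape` calls above number the rows in whatever order they appear in `df`, which works here because the example is already chronological. If that can't be guaranteed, one option is to sort by the parsed date first. A rough sketch, where `shuffled` is a hypothetical reordered copy I made up just to show the effect:

```r
# Same example data as in the question, rebuilt so the block runs on its own.
df <- data.frame(
  ID = c(1L, 1L, 2L, 2L, 1L),
  Product = c("Bike", "Tire", "Car", "Seat", "Chronometer"),
  Date = c("1/1/2000", "2/1/2000", "15/2/2000", "17/2/2000", "20/2/2000"),
  stringsAsFactors = FALSE
)

# Hypothetical shuffled copy, only to show that row order no longer matters.
shuffled <- df[c(5, 3, 1, 4, 2), ]

# Parse the day/month/year strings so the sort is chronological, not alphabetical.
parsed <- as.Date(shuffled$Date, format = "%d/%m/%Y")
sorted <- shuffled[order(shuffled$ID, parsed), ]

# Number the purchases within each ID, in the sorted order.
sorted$q <- ave(seq_len(nrow(sorted)), sorted$ID, FUN = seq_along)

wide <- reshape(
  sorted[, c("ID", "Product", "q")],
  direction = "wide",
  idvar = "ID",
  timevar = "q"
)
wide
```

Sorting on the raw strings would not be safe, since "15/2/2000" sorts before "2/1/2000" as text.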
Data
> dput(df)
structure(list(ID = c(1L, 1L, 2L, 2L, 1L), Product = c("Bike",
"Tire", "Car", "Seat", "Chronometer"), Date = c("1/1/2000", "2/1/2000",
"15/2/2000", "17/2/2000", "20/2/2000")), class = "data.frame", row.names = c(NA,
-5L))
We can use `dcast` from `data.table`
library(data.table)

dcast(setDT(df), ID ~ rowid(ID), value.var = c('Product', 'Date'))

#   ID Product_1 Product_2   Product_3    Date_1    Date_2    Date_3
#1:  1      Bike      Tire Chronometer  1/1/2000  2/1/2000 20/2/2000
#2:  2       Car      Seat        <NA> 15/2/2000 17/2/2000      <NA>
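`rowid(ID)` counts rows in their current order, so the call above also relies on the data already being chronological. A small sketch of a variant that sorts first and keeps only the products, assuming the same example data (`dt` and `wide` are names I picked for illustration):

```r
library(data.table)

# Same example data as in the question, built directly as a data.table.
dt <- data.table(
  ID = c(1L, 1L, 2L, 2L, 1L),
  Product = c("Bike", "Tire", "Car", "Seat", "Chronometer"),
  Date = c("1/1/2000", "2/1/2000", "15/2/2000", "17/2/2000", "20/2/2000")
)

# Parse the dates, then sort by reference so rowid() counts in date order.
dt[, Date := as.Date(Date, format = "%d/%m/%Y")]
setorder(dt, ID, Date)

# Only Product is spread; with a single value.var the new columns
# are named after the row ids themselves.
wide <- dcast(dt, ID ~ rowid(ID), value.var = "Product")
wide
```

The new columns can be renamed afterwards with `setnames()` if something like `1st`, `2nd`, `3rd` is preferred.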
data
df <- structure(list(ID = c(1L, 1L, 2L, 2L, 1L), Product = c("Bike",
"Tire", "Car", "Seat", "Chronometer"), Date = c("1/1/2000", "2/1/2000",
"15/2/2000", "17/2/2000", "20/2/2000")), class = "data.frame",
row.names = c(NA,
-5L))