How to reshape a dataframe by similar columns
I have a data frame with five columns, as shown below:
id  p1   p2   time                 group
__  ___  ___  ___________________  _____
1   1.2  1.9  2016-10-09 01:00:00  1
1   1.8  1.3  2016-10-09 03:00:00  1
1   1.2  1.9  2016-10-09 03:00:00  2
1   1.8  1.3  2016-10-09 06:00:00  2
3   1.2  1.9  2016-10-09 09:00:00  1
3   1.8  1.3  2016-10-09 12:00:00  1
From this, I need to reshape from long to wide for each id and each group, like this:
id  group  p1_start  p2_start  time_start           p1_complete  p2_complete  time_complete
__  _____  ________  ________  ___________________  ___________  ___________  ___________________
1   1      1.2       1.9       2016-10-09 01:00:00  1.2          1.9          2016-10-09 03:00:00
1   2      1.2       1.9       2016-10-09 06:00:00  1.2          1.9          2016-10-09 03:00:00
3   1      1.2       1.9       2016-10-09 09:00:00  1.2          1.9          2016-10-09 12:00:00
So I tried:
reshape(DT, idvar = c("id","group"), timevar = "group", direction = "wide")
but this produced unexpected output.
Any help is appreciated.
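A note on the attempt above: `group` is passed as both `idvar` and `timevar`, and the data has no column that tells the first row of each (id, group) pair from the second, so `reshape()` has nothing to spread on. A minimal base R sketch of one way around that, assuming every (id, group) pair has exactly two rows already in chronological order. The helper column `stage` is a hypothetical name, not part of the original data:

```r
# Sample data from the question; time is kept as character
df <- read.table(text = "id p1 p2 time group
1 1.2 1.9 '2016-10-09 01:00:00' 1
1 1.8 1.3 '2016-10-09 03:00:00' 1
1 1.2 1.9 '2016-10-09 03:00:00' 2
1 1.8 1.3 '2016-10-09 06:00:00' 2
3 1.2 1.9 '2016-10-09 09:00:00' 1
3 1.8 1.3 '2016-10-09 12:00:00' 1", header = TRUE, stringsAsFactors = FALSE)

# Number the rows within each (id, group) pair: 1 = first row, 2 = second row
pos <- ave(seq_len(nrow(df)), df$id, df$group, FUN = seq_along)

# Hypothetical helper column: label row 1 "start" and row 2 "complete"
df$stage <- c("start", "complete")[pos]

# Spread on the helper column instead of on `group`
wide <- reshape(df,
                idvar     = c("id", "group"),
                timevar   = "stage",
                direction = "wide",
                sep       = "_")
```

With this data, `wide` has one row per (id, group) pair and the columns id, group, p1_start, p2_start, time_start, p1_complete, p2_complete, time_complete. If a pair could have more than two rows, the labelling step would need to change.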
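Try this, where df is your original data. It assumes each id/group pair has exactly two rows, with the start row first.
library(data.table)
setDT(df)  # convert df to a data.table by reference
# For each id/group pair, put the first row (.SD[1,]) and the second row (.SD[2,]) side by side
df <- df[, c(.SD[1,], .SD[2,]), by = c('id', 'group')]
# The result has duplicated column names (p1, p2, time twice), so rename them all
names(df) <- c('id', 'group', 'p1_start', 'p2_start', 'time_start', 'p1_complete', 'p2_complete', 'time_complete')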
If you are not set on a data.table solution:
library(dplyr) # for pipes `%>%`
library(tidyr) # for `spread`
df %>%
  cbind(spread_grp = c("start","complete")) %>% # adds a column which alternates "start" and "complete"
  nest(p1,p2,time) %>%                          # nest the columns we want to spread
  spread(spread_grp,data) %>%                   # spreads our nested column
  unnest(.sep="_")                              # unnest, concatenating the original column names with the spread_grp values
# id group complete_p1 complete_p2 complete_time start_p1 start_p2 start_time
# 1 1 1 1.8 1.3 2016-10-09 03:00:00 1.2 1.9 2016-10-09 01:00:00
# 2 1 2 1.8 1.3 2016-10-09 06:00:00 1.2 1.9 2016-10-09 03:00:00
# 3 3 1 1.8 1.3 2016-10-09 12:00:00 1.2 1.9 2016-10-09 09:00:00
The names are not exactly the same as in your expected output; hopefully that is not a problem.
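The `nest()` / `spread()` / `unnest(.sep = )` pipeline above uses the tidyr interface from before version 1.0.0; later releases changed how `nest()` takes its columns and deprecated `.sep`. A hedged alternative sketch with `pivot_wider()` (available from tidyr 1.0.0), assuming each (id, group) pair has exactly two rows in chronological order. The helper column `stage` is a hypothetical name introduced only for this example:

```r
library(dplyr)
library(tidyr)

# Sample data from the question; time is kept as character
df <- read.table(text = "id p1 p2 time group
1 1.2 1.9 '2016-10-09 01:00:00' 1
1 1.8 1.3 '2016-10-09 03:00:00' 1
1 1.2 1.9 '2016-10-09 03:00:00' 2
1 1.8 1.3 '2016-10-09 06:00:00' 2
3 1.2 1.9 '2016-10-09 09:00:00' 1
3 1.8 1.3 '2016-10-09 12:00:00' 1", header = TRUE, stringsAsFactors = FALSE)

wide <- df %>%
  group_by(id, group) %>%
  # Hypothetical helper column: first row of each pair is "start", second is "complete"
  mutate(stage = c("start", "complete")[row_number()]) %>%
  ungroup() %>%
  # One output row per (id, group); value columns are named <value>_<stage>
  pivot_wider(id_cols     = c(id, group),
              names_from  = stage,
              values_from = c(p1, p2, time))
```

Labelling rows per group, rather than recycling `c("start","complete")` over the whole data frame, keeps the labels correct even if the pairs are not stored in strictly alternating order. The column names come out as p1_start, p1_complete and so on, matching the names asked for in the question.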
Data
df <- read.table(text="id p1 p2 time group
1 1.2 1.9 '2016-10-09 01:00:00' 1
1 1.8 1.3 '2016-10-09 03:00:00' 1
1 1.2 1.9 '2016-10-09 03:00:00' 2
1 1.8 1.3 '2016-10-09 06:00:00' 2
3 1.2 1.9 '2016-10-09 09:00:00' 1
3 1.8 1.3 '2016-10-09 12:00:00' 1",stringsAsFactors = FALSE,header=TRUE)