Complex Restructure in R: strings, numeric and dates
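I have a wide dataset in which each row (one person) provides up to three observations on up to three different dates. Each observation consists of a date, a description and a number of minutes. A person can provide as many observations as they like, and can appear in more than one row with additional observations.
The test data is here: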
library(RCurl)
fwt <- getURL("https://raw.githubusercontent.com/bac3917/Cauldron/master/fwt.csv")
fwt<-read.csv(text=fwt)
Convert the columns to the correct format:
library(lubridate)
fwt$date1<-as.Date(fwt$date1, format='%m/%d/%Y')
fwt$date2<-as.Date(fwt$date2, format='%m/%d/%Y')
fwt$date3<-as.Date(fwt$date3, format='%m/%d/%Y')
# condense dataset; 3 sets of columns into 1
cols <- names(fwt) %in% c("naecy1_2","naecy1_1","naecy1_3","naecy1_4","naecy1_5","naecy1_6",
"naecy2_2","naecy2_1","naecy2_3","naecy2_4","naecy2_5","naecy2_6",
"naecy3_2","naecy3_1","naecy3_3","naecy3_4","naecy3_5","naecy3_6")
fwt[cols]<-lapply(fwt[cols], as.numeric) #convert to numeric all
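fwt[cols][is.na(fwt[cols])] <- 0 # replace missing minutes with 0; is.na(cols) alone is never TRUE because cols is a logical vector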
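The conversion and zero-fill step can be tried on a small made-up data frame. This is a minimal sketch, not the real fwt.csv: the `toy` data and the `minute_cols` name are assumptions for illustration. The zero-fill has to test the values inside the selected columns, since a TRUE/FALSE selector built with `%in%` or `grepl()` never contains NA itself.

```r
# Hypothetical toy data standing in for the naecy columns (not the real fwt.csv).
toy <- data.frame(
  name     = c("Ann", "Bob"),
  naecy1_1 = c("10", NA),
  naecy1_2 = c("5", "20"),
  stringsAsFactors = FALSE
)

# Select the minute columns by name pattern instead of listing each one.
minute_cols <- grepl("^naecy[0-9]+_[0-9]+$", names(toy))

# Convert the selected columns from character to numeric.
toy[minute_cols] <- lapply(toy[minute_cols], as.numeric)

# Test for NA inside the selected columns, then overwrite those cells with 0.
toy[minute_cols][is.na(toy[minute_cols])] <- 0

print(toy)
```

Selecting by pattern is only a convenience; the explicit list of 18 column names from the question works the same way.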
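Essentially there are three sets of date/description/minutes that need to be stacked into one long format. I would like the data to look like this after restructuring: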
Name Date NAECY1 NAECY2 NAECY3 NAECY4 NAECY5 NAECY6
I have tried reshape2 and tidyr, but could not solve this. Does anyone have an idea?
Thanks...
Here is a quick solution:
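library(magrittr) # provides the %>% pipe
# sprintf() templates: %d is replaced by the set number (1, 2 or 3)
cols <- c("name", "date%d","descr%d", "naecy%d_1", "naecy%d_2", "naecy%d_3", "naecy%d_4", "naecy%d_5", "naecy%d_6")
# the backslash must be doubled inside an R string
cols_renamed <- c("Name Date Descr NAECY1 NAECY2 NAECY3 NAECY4 NAECY5 NAECY6") %>% strsplit("\\W+") %>% unlist
# take each set of columns in turn, give it the common names, then stack the three pieces
new_fwt <- lapply(1:3, function(i) {
  df <- fwt[, sprintf(cols, i)]
  colnames(df) <- cols_renamed
  df
}) %>% do.call(rbind, .)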
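To see what the stacking does without downloading the data, here is a minimal base-R sketch of the same idea on made-up data. The `toy` data frame, its two observation sets and the names `templates`, `new_names`, `pieces` and `long` are assumptions for illustration. Dropping the rows with no date is an extra step that the answer above does not include.

```r
# Hypothetical toy data: two people, two observation sets, two minute columns each.
toy <- data.frame(
  name     = c("Ann", "Bob"),
  date1    = as.Date(c("2015-01-05", "2015-01-06")),
  descr1   = c("reading", "math"),
  naecy1_1 = c(10, 0),
  naecy1_2 = c(0, 20),
  date2    = as.Date(c("2015-02-01", NA)),
  descr2   = c("art", NA),
  naecy2_1 = c(5, NA),
  naecy2_2 = c(15, NA),
  stringsAsFactors = FALSE
)

# Templates for the per-set columns; %d is replaced by the set number.
templates <- c("date%d", "descr%d", "naecy%d_1", "naecy%d_2")
new_names <- c("Name", "Date", "Descr", "NAECY1", "NAECY2")

# Pull out each set with common column names so the pieces can be stacked.
pieces <- lapply(1:2, function(i) {
  piece <- toy[, c("name", sprintf(templates, i))]
  colnames(piece) <- new_names
  piece
})
long <- do.call(rbind, pieces)

# Drop observation sets that were never filled in (no date recorded).
long <- long[!is.na(long$Date), ]
rownames(long) <- NULL

print(long)
```

Keeping "name" out of the `sprintf()` templates avoids passing an unused argument to a format string that has no placeholder.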