How to group data by date and apply an operation (mean) without changing the existing dimensions of the data frame in R?
Given the following dataset:
Hours<-c(2,3,4,2,1,1,3)
Project<-c("a","b","b","a","a","b","a")
Period<-c("2014-11-22","2014-11-23","2014-11-24","2014-11-22", "2014-11-23", "2014-11-23", "2014-11-24")
cd=data.frame(Project,Hours,Period)
My goal is to average Hours by date without changing the structure of the data frame. Here is the target:
Hours_goal<-c(2,1.6,3.5,2,1.6,1.6,3.5)
Project_goal<-c("a","b","b","a","a","b","a")
Period_goal<-c("2014-11-22","2014-11-23","2014-11-24","2014-11-22", "2014-11-23", "2014-11-23", "2014-11-24")
cd_goal=data.frame(Project_goal,Hours_goal,Period_goal)
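As you can see above, the Project and Period columns are unchanged, but Hours in the final result holds the mean hours for each day. For example, for 2014-11-23 the original data has the values 3, 1 and 1, whose mean is 5/3 ≈ 1.67 (written as 1.6 in the goal vector). That mean therefore replaces every value in this column for that date.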
Try
cd$Hours <- with(cd, ave(Hours, Period, FUN = function(x) mean(x, na.rm=TRUE)))
names(cd) <- paste(names(cd), 'goal', sep="_")
Or
library(dplyr)
cd %>%
group_by(Period) %>%
mutate(Hours=mean(Hours, na.rm=TRUE))
Or
library(data.table)
setDT(cd)[, Hours:= mean(Hours, na.rm=TRUE), by=Period]
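As a quick sanity check, here is a minimal base R sketch of the `ave()` approach run against the example data. It assumes nothing beyond base R. The name `result` is only an illustrative choice so that the original `cd` is left untouched.

```r
# Rebuild the example data from the question.
Hours   <- c(2, 3, 4, 2, 1, 1, 3)
Project <- c("a", "b", "b", "a", "a", "b", "a")
Period  <- c("2014-11-22", "2014-11-23", "2014-11-24", "2014-11-22",
             "2014-11-23", "2014-11-23", "2014-11-24")
cd <- data.frame(Project, Hours, Period)

# ave() returns a vector as long as its input, so every row is kept;
# each Hours value is replaced by the mean of its Period group.
result <- cd  # 'result' is a hypothetical name used for illustration
result$Hours <- ave(cd$Hours, cd$Period,
                    FUN = function(x) mean(x, na.rm = TRUE))

# The data frame keeps its original dimensions (7 rows, 3 columns).
stopifnot(identical(dim(result), dim(cd)))

print(round(result$Hours, 2))
# [1] 2.00 1.67 3.50 2.00 1.67 1.67 3.50
```

Project and Period are carried over unchanged, which is the behaviour the question asks for. The `dplyr` and `data.table` variants should give the same Hours column.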