How to group data by date by an operation (mean) without affecting the existing dimensions of the data frame in R?

Given the following data set:

Hours<-c(2,3,4,2,1,1,3)
Project<-c("a","b","b","a","a","b","a")
Period<-c("2014-11-22","2014-11-23","2014-11-24","2014-11-22", "2014-11-23", "2014-11-23", "2014-11-24")
cd=data.frame(Project,Hours,Period)

My goal is to average the hours by date without changing the structure of the data frame. Here is the target:

Hours_goal<-c(2,1.6,3.5,2,1.6,1.6,3.5)
Project_goal<-c("a","b","b","a","a","b","a")
Period_goal<-c("2014-11-22","2014-11-23","2014-11-24","2014-11-22", "2014-11-23", "2014-11-23", "2014-11-24")
cd_goal=data.frame(Project_goal,Hours_goal,Period_goal)

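As you can see above, the Project and Period columns are unchanged, but the Hours column now holds the mean hours for each day. For example, for 2014-11-23 the original values are 3, 1 and 1, whose mean is 5/3 ≈ 1.67 (written as 1.6 in the goal). So every Hours value for that date is replaced by that mean.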

Try ave() from base R, which returns a vector of the same length as its input:

cd$Hours <- with(cd, ave(Hours, Period, FUN = function(x) mean(x, na.rm=TRUE)))
names(cd) <- paste(names(cd), 'goal', sep="_")

Or, with dplyr:

library(dplyr)
cd %>%
  group_by(Period) %>%
  mutate(Hours = mean(Hours, na.rm = TRUE))

Or, with data.table:

library(data.table)
setDT(cd)[, Hours:= mean(Hours, na.rm=TRUE), by=Period]
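As a quick sanity check, here is a minimal, self-contained sketch of the base R approach run on the data from the question. The names `result` and `daily_means` are introduced here for illustration only and are not part of the original post.

```r
# Rebuild the example data from the question.
Hours   <- c(2, 3, 4, 2, 1, 1, 3)
Project <- c("a", "b", "b", "a", "a", "b", "a")
Period  <- c("2014-11-22", "2014-11-23", "2014-11-24", "2014-11-22",
             "2014-11-23", "2014-11-23", "2014-11-24")
cd <- data.frame(Project, Hours, Period)

# Replace each Hours value with the mean of its Period group.
# ave() returns one value per input row, so the row count is unchanged.
# `result` is an illustrative name, not from the original post.
result <- transform(
  cd,
  Hours = ave(Hours, Period, FUN = function(x) mean(x, na.rm = TRUE))
)

# One mean per date, for comparison (`daily_means` is also illustrative).
daily_means <- tapply(cd$Hours, cd$Period, mean)

print(result)
print(daily_means)
```

The data frame keeps its seven rows and three columns; only Hours changes. Note that the value for 2014-11-23 is stored at full precision (5/3), so it prints as roughly 1.67 rather than the 1.6 shown in the goal.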