Adjusting date column in data.frame with differing weekdays to consistent Day of Week

I have a set of tennis rankings for a player, ordered by date

date <- as.Date(c("1973-08-23","1973-09-13","1973-09-26","1973-10-15","1973-10-31"))
ranking <- c(1,2,3,3,1)
df <- data.frame(date,ranking)

        date ranking
1 1973-08-23       1
2 1973-09-13       2
3 1973-09-26       3
4 1973-10-15       3
5 1973-10-31       1

They are roughly biweekly, but the day of the week varies

library(lubridate)
wday(df$date) # [1] 5 5 4 2 4

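I would like to create a data.frame (df2) from the data above that shows the ranking on every Monday, carrying the most recent ranking forward. The result would be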

         date ranking
1  1973-08-27       1
2  1973-09-03       1
3  1973-09-10       1
4  1973-09-17       2
5  1973-09-24       2
6  1973-10-01       3
7  1973-10-08       3
8  1973-10-15       3
9  1973-10-22       3
10 1973-10-29       3
11 1973-11-05       1

wday(df2$date) # [1] 2 2 2 2 2 2 2 2 2 2 2

This is somewhat simplified, as the real data has PlayerA, PlayerB etc. at every ranking date

Any help much appreciated

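You have to create an empty data frame of the required size, with one row per Monday. Then you write a for loop that runs through this new data frame and fills each row with the latest available ranking. Each time a newer ranking becomes available, you move on to the next row of the input.

To make this clearer: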

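# input: one row per ranking date, sorted; column 1 = date, other columns = players
# output: one row per Monday, same columns as input
j <- 1
for (i in seq_len(nrow(output))) {
  # advance to the latest ranking dated on or before this Monday
  while (j < nrow(input) && input[j + 1, 1] <= output[i, 1]) {
    j <- j + 1
  }
  output[i, -1] <- input[j, -1]
}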

where output has as many rows as there are Mondays and as many columns as there are players, plus one for the date
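As a worked illustration of the fill-forward loop on the question's data, here is a minimal sketch. It assumes a single player column, named playerA here purely as a placeholder, and it assumes the first Monday falls on or after the first ranking date; an earlier Monday would simply receive the first ranking.

```r
# Question's data; playerA is a placeholder column name
input <- data.frame(
  date = as.Date(c("1973-08-23", "1973-09-13", "1973-09-26",
                   "1973-10-15", "1973-10-31")),
  playerA = c(1, 2, 3, 3, 1)
)

# One row per Monday, rankings not yet filled in
output <- data.frame(
  date = seq(as.Date("1973-08-27"), as.Date("1973-11-05"), by = "week"),
  playerA = NA_real_
)

j <- 1
for (i in seq_len(nrow(output))) {
  # move to the latest ranking dated on or before this Monday
  while (j < nrow(input) && input$date[j + 1] <= output$date[i]) {
    j <- j + 1
  }
  output[i, -1] <- input[j, -1]
}

output$playerA
# 1 1 1 2 2 3 3 3 3 3 1
```

Because both data frames are sorted by date, j only ever moves forward, so the loop makes a single pass over each of them.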

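library(lubridate)

weekly_ranks <- function(df) {
  date <- df[, 1]
  rank <- df[, 2]
  # first Monday on or after the earliest ranking date (wday: Sunday = 1, Monday = 2)
  first <- min(date)
  start <- if (wday(first) <= 2) {
    first + 2 - wday(first)
  } else {
    first + 9 - wday(first)
  }
  end <- max(date)
  mondays <- seq(start, end, by = 7)
  # cut() labels each Monday with the ranking date that opens its interval
  ranks <- match(as.character(cut(mondays, date)), as.character(date))
  # the last ranking date itself is appended as the final row
  data.frame(date = c(mondays, end),
             ranking = c(rank[ranks], rank[date == end]))
}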

weekly_ranks(Player_A)
#          date ranking
# 1  1973-08-27       1
# 2  1973-09-03       1
# 3  1973-09-10       1
# 4  1973-09-17       2
# 5  1973-09-24       2
# 6  1973-10-01       3
# 7  1973-10-08       3
# 8  1973-10-15       3
# 9  1973-10-22       3
# 10 1973-10-29       3
# 11 1973-10-31       1

To do this for all players at once, you can:

lst <- list(Player_A, Player_B, Player_C)
lapply(lst, weekly_ranks)

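The ideal solution is a data.table rolling join with roll=Inf, which matches each Monday to the most recent ranking on or before it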

library(data.table)
df1 <- data.table(date=seq.Date(from=as.Date("1973-08-27"), to=as.Date("1973-11-05"), by=7))
setkey(setDT(df), date)
setkey(df1, date)
df[df1, roll=Inf]
#          date ranking
# 1: 1973-08-27       1
# 2: 1973-09-03       1
# 3: 1973-09-10       1
# 4: 1973-09-17       2
# 5: 1973-09-24       2
# 6: 1973-10-01       3
# 7: 1973-10-08       3
# 8: 1973-10-15       3
# 9: 1973-10-22       3
#10: 1973-10-29       3
#11: 1973-11-05       1
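For readers without data.table, the same "last observation on or before each Monday" lookup can be sketched in base R with findInterval. This is an alternative technique, not what the rolling join does internally, and it assumes the ranking dates are sorted in increasing order. The hard-coded Monday range is taken from the question's expected output.

```r
date <- as.Date(c("1973-08-23", "1973-09-13", "1973-09-26",
                  "1973-10-15", "1973-10-31"))
ranking <- c(1, 2, 3, 3, 1)

mondays <- seq(as.Date("1973-08-27"), as.Date("1973-11-05"), by = "week")

# For each Monday, index of the last ranking date that is <= that Monday
idx <- findInterval(as.numeric(mondays), as.numeric(date))
# A Monday before the first ranking date gets index 0; turn that into NA
idx[idx == 0] <- NA

df2 <- data.frame(date = mondays, ranking = ranking[idx])
df2$ranking
# 1 1 1 2 2 3 3 3 3 3 1
```

With several player columns, the same idx vector can be used to subset the rows of a wide data frame in one step.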