Calculate mean value of data in R on fixed date range for multiple years

I have daily temperature data spanning more than 40 years. Here is a sample:

       date tmax
1   1971-01-01 18.9
2   1971-01-02 19.0
3   1971-01-03 19.5
4   1971-01-04 19.2
5   1971-01-05 19.5
.
.
.
17536   2020-12-29 19.7
17537   2020-12-30 18.9

I want to calculate the mean temperature over the crop growing season, i.e. 7 June to 9 November of each year. How can we do this in R?

Sample data

library(dplyr)
set.seed(1)
# n() gives the row count inside mutate(); d cannot refer to itself while it is being created
d <- tibble(date = seq(as.Date("1971-01-01"), Sys.Date(), by = "day")) %>%
        mutate(tmax = round(rnorm(n(), 20, 3), 1))

A tidyverse solution

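library(lubridate)
library(tidyr)

d %>% 
   # make_date() returns a Date (midnight), so 7 June itself falls inside the interval;
   # ISOdate() defaults to 12:00 GMT and would leave 7 June out
   mutate(is_crop_season = date %within% interval(make_date(year(date), 6, 7), 
                                                  make_date(year(date), 11, 9))) %>%
   group_by(is_crop_season, year = year(date)) %>%
   summarise(mean = mean(tmax), .groups = "drop") %>%
   # one row per year, with one column for each value of is_crop_season
   pivot_wider(id_cols = year, 
               names_from = is_crop_season, 
               names_glue = "{ifelse(is_crop_season, 'crop_season', 'no_crop_season')}", 
               values_from = mean)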
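
A note on the season boundaries: base R's ISOdate() fills in 12:00 GMT when no time is given, whereas a Date converts to midnight UTC, so a lower bound built with ISOdate() lies after the Date for that same day. The following base R sketch checks this for one arbitrarily chosen day (the variable names are illustrative only):

```r
# ISOdate() fills in hour = 12 and tz = "GMT" when no time is supplied
bound_noon <- ISOdate(2020, 6, 7)

# A Date converts to midnight UTC of that day
day_start <- as.POSIXct(as.Date("2020-06-07"), tz = "UTC")

# Is midnight of 7 June earlier than the noon boundary?
starts_before_bound <- day_start < bound_noon

# Gap between the two instants, in hours
gap_hours <- as.numeric(difftime(bound_noon, day_start, units = "hours"))
```

If 7 June should count as part of the season, a Date-valued lower bound (for instance from lubridate's make_date()) avoids the half-day offset.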

Here is one option:

Extract the day, month, and year from the date, filter to keep only the rows from 7 June to 9 November, and take the mean of the tmax values for each year.

library(dplyr)
library(lubridate)

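# df is the data frame from the question, with columns date and tmax
df %>%
  mutate(date = as.Date(date), 
         day = day(date), 
         month = month(date), 
         year = year(date)) %>%
  # keep all of July to October, plus 7-30 June and 1-9 November
  filter(between(month, 7, 10) | 
         (day >= 7 & month == 6) | 
         (day <= 9 & month == 11)) %>%
  group_by(year) %>%
  summarise(tmax = mean(tmax, na.rm = TRUE))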
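
For comparison, the same filter-then-average idea can be sketched in base R by comparing zero-padded "month-day" strings. This is only a sketch under stated assumptions: the toy data frame and the names toy, in_season, season_mean, and season_days are made up for illustration and do not come from the question, and tmax is held constant within each year so that the result is easy to verify.

```r
# Hypothetical toy data: two full years of daily values, constant tmax per year
dates <- seq(as.Date("2019-01-01"), as.Date("2020-12-31"), by = "day")
toy <- data.frame(
  date = dates,
  tmax = ifelse(format(dates, "%Y") == "2019", 20, 25)
)

# "%m-%d" gives zero-padded strings such as "06-07", which sort chronologically
md <- format(toy$date, "%m-%d")
in_season <- md >= "06-07" & md <= "11-09"

# Mean tmax per year over the crop-season rows only
season_mean <- aggregate(
  tmax ~ year,
  data = transform(toy[in_season, ], year = as.integer(format(date, "%Y"))),
  FUN = mean
)

# Number of days kept per year (7 June to 9 November, both ends included)
season_days <- table(format(toy$date[in_season], "%Y"))
```

The string comparison avoids the separate day and month columns, at the cost of being less explicit than the filter() conditions above.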