Obtaining the number of days it takes to reach the desired amount
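I currently have a data frame containing the station name, rainfall date, and rainfall amount (example attached). I'm interested in finding out, for each station, the number of days (and/or months) it takes to reach a certain amount of rainfall.
For example:
Is it possible to obtain the above output from the example dataset?
My initial thought was to filter each station individually, join it to a calendar data frame, extract the minimum and maximum from the range, count the days between them, and classify them with case_when. That approach seems rather convoluted, so any guidance on a better method would be much appreciated.
Thanks for any suggestions!
Example dataset: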
Example <- structure(list(Name.Station = c("Station A", "Station A", "Station A",
"Station A", "Station A", "Station B", "Station B", "Station B",
"Station C", "Station C", "Station C", "Station C"), Rainfall.Date = c("7/10/2020",
"8/12/2020", "8/01/2021", "25/06/2021", "26/10/2021", "7/01/2020",
"22/01/2020", "5/02/2020", "5/09/2020", "5/10/2020", "5/11/2020",
"5/12/2020"), Rainfall.Amount = c(210, 210, 208.47, 208.16, 203.67,
227.49, 225, 222.54, 250, 250, 246.18, 245.15)), class = "data.frame", row.names = c(NA,
-12L))
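By station, you could calculate the cumsum of the rainfall amounts and find the first event at which it exceeds a threshold in mm. Then take the length of the seq of days from the station's start date to the date of that event.
First, though, your dates should be properly formatted.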
Example <- transform(Example, Rainfall.Date=as.Date(Rainfall.Date, '%d/%m/%Y'))
do.call(rbind, by(Example, Example$Name.Station, \(x) {
  f <- \(mm, x.=x) {
    hit <- cumsum(x.$Rainfall.Amount) > mm
    if (!any(hit)) return(NA_real_)  ## threshold never reached at this station
    mx <- which.max(hit)  ## index of the first event past the threshold
    length(do.call(seq.Date, c(as.list(range(x.$Rainfall.Date[1:mx])), 1)))
  }
  ds <- seq.int(200, 1e3, 200)  ## sequence of 200, 400, ... , 1000mm
  r <- t(vapply(ds, f, 0))
  data.frame(Name.Station=el(x$Name.Station), `colnames<-`(r, paste0('d_', ds)))
}))
#           Name.Station d_200 d_400 d_600 d_800 d_1000
# Station A    Station A     1    63    94   262    385
# Station B    Station B     1    16    30    NA     NA
# Station C    Station C     1    31    62    92     NA
Note: this requires R >= 4.1, because it uses the \(x) shorthand for anonymous functions.
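To make the cumulative-sum step concrete, here is a minimal sketch for Station A alone, using the values from the example dataset. The variable names (dates, amount, thresholds, first_hit, days_to) are illustrative and not part of the answer above, and the sketch assumes that a threshold which is never exceeded should be reported as NA.

```r
# Station A rows from the example dataset
dates  <- as.Date(c("7/10/2020", "8/12/2020", "8/01/2021",
                    "25/06/2021", "26/10/2021"), "%d/%m/%Y")
amount <- c(210, 210, 208.47, 208.16, 203.67)

thresholds <- seq(200, 1000, 200)   # 200, 400, ..., 1000 mm
cum <- cumsum(amount)               # running rainfall total

# Index of the first event whose running total exceeds each threshold
# (assumption: NA when the threshold is never exceeded)
first_hit <- vapply(thresholds, function(mm) {
  hit <- which(cum > mm)
  if (length(hit)) hit[1] else NA_integer_
}, integer(1))

# Inclusive day count from the first event to the crossing event
days_to <- as.numeric(dates[first_hit] - dates[1]) + 1
names(days_to) <- paste0("d_", thresholds)
days_to
#  d_200  d_400  d_600  d_800 d_1000
#      1     63     94    262    385
```

Counting inclusively (the + 1) matches the length of the daily sequence used in the answer, so the first rainfall event counts as day 1.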
Here is a tidyverse approach:
library(dplyr)
library(tidyr)
Example %>%
  group_by(Name.Station) %>%
  mutate(Rainfall.Date = as.Date(Rainfall.Date, "%d/%m/%Y"),
         days = cumsum(c(1, diff(Rainfall.Date))),
         crainfall = cumsum(Rainfall.Amount),
         fi = (findInterval(crainfall, seq(0, 1000, 200)) - 1) * 200) %>%
  pivot_wider(id_cols = Name.Station, names_from = fi, values_from = days,
              names_glue = "days_to_{fi}_mm", values_fn = min)
# A tibble: 3 x 6
# Groups: Name.Station [3]
#   Name.Station days_to_200_mm days_to_400_mm days_to_600_mm days_to_800_mm days_to_1000_mm
#   <chr>                 <dbl>          <dbl>          <dbl>          <dbl>           <dbl>
# 1 Station A                 1             63             94            262             385
# 2 Station B                 1             16             30             NA              NA
# 3 Station C                 1             31             62             92              NA
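The fi column is the least obvious step in that pipeline, so here is a small base R sketch of what findInterval does with Station A's running totals. The names crainfall, breaks, idx, and fi mirror the pipeline where possible but are otherwise illustrative.

```r
# Running rainfall totals for Station A (values from the example dataset)
crainfall <- cumsum(c(210, 210, 208.47, 208.16, 203.67))
breaks <- seq(0, 1000, 200)   # 0, 200, ..., 1000

# findInterval() gives, for each total, the index of the last break <= it
idx <- findInterval(crainfall, breaks)

# Convert that index back to the threshold in millimetres
fi <- (idx - 1) * 200
fi
# [1]  200  400  600  800 1000
```

Each row is thus labelled with the highest threshold its running total has reached, and values_fn = min keeps the smallest day count per label. One caveat of this labelling: if a single rainfall event carried the total past two thresholds at once, the lower one would get no label. That does not happen in the example data.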