How to calculate time since temperature was so cold

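I would like some help calculating the time since the temperature was last as cold as it was on a given date.

In the example data frame below, for the first record (01/07/2000), the last time it was as cold as this (-1) was 01/01/2000 (about 182 days earlier).

For the second record (01/06/2000), the last time it was this cold (2 degrees) was the previous month (01/05/2000), when it was actually colder (1 degree), so about 30 days earlier.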

df <- data.frame(date=as.Date(c("01/07/2000", "01/06/2000", "01/05/2000", 
                                "01/04/2000", "01/03/2000", "01/02/2000", 
                                "01/01/2000"), "%d/%m/%Y"), 
                 temperature =c(-1, 2, 1, 0, 1, 1, -1))
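
As a quick check of the day counts quoted above, plain date subtraction in R gives the exact figures. This is only a small sketch; d1 and d2 are illustrative names that are not used elsewhere.

```r
# Days between the first record and the last equally cold day
d1 <- as.numeric(as.Date("2000-07-01") - as.Date("2000-01-01"))
# Days between the second record and the previous (colder) month
d2 <- as.numeric(as.Date("2000-06-01") - as.Date("2000-05-01"))
c(d1, d2)
#[1] 182  31
```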

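I tried adapting another approach, but found that it becomes unwieldy when computing this on a weekly basis.

Do you have any ideas on how to calculate, for each week, the number of days since it was last this cold? Many thanks, I really appreciate your help.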

Suppose you have temperature data for different grids like this,

#          date grid temp
# 1  2000-01-01    A   -1
# 2  2000-02-01    A   -1
# 3  2000-03-01    A   -1
# ...
# 10 2000-01-01    B    2
# 11 2000-02-01    B    1
# ...

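You could use by for a split-apply-combine approach along the grids. Within each grid cell we apply a Vectorized function that computes the difference in days since the temperature of a given date last occurred. If there was no earlier occurrence, it gives NA.
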
f <- Vectorize(function(data, x) {
  diff(rev(with(data, date[date <= x & temp == temp[date == x]]))[2:1])
}, vectorize.args="x")
res <- do.call(rbind, by(d, d$grid, function(g) cbind(g, last=f(g, g$date))))

res
#            date grid temp last
# A.1  2000-01-01    A   -1   NA
# A.2  2000-02-01    A   -1   31
# A.3  2000-03-01    A   -1   29
# A.4  2000-04-01    A   -1   31
# A.5  2000-05-01    A    0   NA
# A.6  2000-06-01    A    2   NA
# A.7  2000-07-01    A    0   61
# A.8  2000-08-01    A    0   31
# A.9  2000-09-01    A   -1  153
# B.10 2000-01-01    B    2   NA
# B.11 2000-02-01    B    1   NA
# B.12 2000-03-01    B    2   60
# B.13 2000-04-01    B    1   60
# B.14 2000-05-01    B    2   61
# B.15 2000-06-01    B   -1   NA
# B.16 2000-07-01    B   -1   30
# B.17 2000-08-01    B    0   NA
# B.18 2000-09-01    B    2  123
# C.19 2000-01-01    C    0   NA
# C.20 2000-02-01    C    0   31
# C.21 2000-03-01    C    1   NA
# C.22 2000-04-01    C    1   31
# C.23 2000-05-01    C   -1   NA
# C.24 2000-06-01    C   -1   31
# C.25 2000-07-01    C    1   91
# C.26 2000-08-01    C    2   NA
# C.27 2000-09-01    C   -1   92
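
To make the one-liner inside f easier to follow, here is a step-by-step sketch for a single date. The small data frame g and the names same, pair, gap, x0 and gap0 are made up for illustration and are not part of the data used above.

```r
# Hypothetical single-grid data, for illustration only
g <- data.frame(
  date = as.Date(c("2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01")),
  temp = c(2, 1, 2, 1)
)
x <- as.Date("2000-03-01")

# Dates up to and including x that have the same temperature as x
same <- with(g, date[date <= x & temp == temp[date == x]])

# Reverse so that x comes first, then pick elements 2 and 1:
# the previous occurrence followed by x itself
pair <- rev(same)[2:1]

# Their difference is the number of days since the previous occurrence
gap <- diff(pair)

# With no earlier occurrence, element 2 does not exist, so the result is NA
x0 <- as.Date("2000-01-01")
gap0 <- diff(rev(with(g, date[date <= x0 & temp == temp[date == x0]]))[2:1])
```

Here gap is 60 days (1 January to 1 March in a leap year) and gap0 is NA.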

Edit

To find out when the temperature was last below a specific threshold temp.th, we can modify the function like this:

temp.th <- 0
f2 <- Vectorize(function(data, x) {
  x - rev(with(data, date[date <= x & temp < temp.th]))[1]
}, vectorize.args="x")
res2 <- do.call(rbind, by(d, d$grid, function(g) cbind(g, last=f2(g, g$date))))

res2
#            date grid temp last
# A.1  2000-01-01    A   -1    0
# A.2  2000-02-01    A   -1    0
# A.3  2000-03-01    A   -1    0
# A.4  2000-04-01    A   -1    0
# A.5  2000-05-01    A    0   30
# A.6  2000-06-01    A    2   61
# A.7  2000-07-01    A    0   91
# A.8  2000-08-01    A    0  122
# A.9  2000-09-01    A   -1    0
# B.10 2000-01-01    B    2   NA
# B.11 2000-02-01    B    1   NA
# B.12 2000-03-01    B    2   NA
# B.13 2000-04-01    B    1   NA
# B.14 2000-05-01    B    2   NA
# B.15 2000-06-01    B   -1    0
# B.16 2000-07-01    B   -1    0
# B.17 2000-08-01    B    0   31
# B.18 2000-09-01    B    2   62
# C.19 2000-01-01    C    0   NA
# C.20 2000-02-01    C    0   NA
# C.21 2000-03-01    C    1   NA
# C.22 2000-04-01    C    1   NA
# C.23 2000-05-01    C   -1    0
# C.24 2000-06-01    C   -1    0
# C.25 2000-07-01    C    1   30
# C.26 2000-08-01    C    2   61
# C.27 2000-09-01    C   -1    0
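
Note that f2 includes the date itself (date <= x), which is why days that are themselves below the threshold get 0. If you would rather count from the previous such day, one possible variant is a strict inequality. This is a sketch only: f2_strict, the toy data frame g and last_strict are hypothetical names, not part of the answer above.

```r
temp.th <- 0

# Hypothetical single-grid data, for illustration only
g <- data.frame(
  date = as.Date(c("2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01")),
  temp = c(-1, -1, 0, 2)
)

# Same idea as f2, but only strictly earlier dates are considered
f2_strict <- Vectorize(function(data, x) {
  x - rev(with(data, date[date < x & temp < temp.th]))[1]
}, vectorize.args = "x")

last_strict <- f2_strict(g, g$date)
last_strict
#[1] NA 31 29 60
```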

Data:

d <- expand.grid(date=seq(as.Date("2000-01-01"), as.Date("2000-09-01"), by="month"),
            grid=LETTERS[1:3])
set.seed(42)
d$temp <- sample(-1:2, nrow(d), replace=TRUE)

A base R option using sapply:

c(sapply(seq(nrow(df) - 1), function(x) {
  tmp <- -(1:x)
  inds <- which(df$temperature[x] >= df$temperature[tmp])[1]
  df$date[x] - df$date[tmp][inds]
}), NA)

#[1] 182  31  30  91  29  31  NA

This assumes your data is sorted in descending order, i.e. the most recent date comes first (as in your example data).
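
If your data is not already in that order, you can sort it first. A minimal sketch, where df_unsorted and df_sorted are hypothetical names for illustration:

```r
# Hypothetical unsorted data, for illustration only
df_unsorted <- data.frame(
  date = as.Date(c("2000-03-01", "2000-07-01", "2000-01-01", "2000-05-01")),
  temperature = c(1, -1, -1, 1)
)

# Put the newest date first, as the sapply code expects
df_sorted <- df_unsorted[order(df_unsorted$date, decreasing = TRUE), ]
```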


To apply this by group, we can turn the code above into a function:

diff_days <- function(temp, date) {
  c(sapply(seq_len(length(temp) - 1), function(x) {
    tmp <- -(1:x)
    inds <- which(temp[x] >= temp[tmp])[1]
    date[x] - date[tmp][inds]
  }), NA)  
}

library(dplyr)
df %>% 
  group_by(met_square) %>% 
  mutate(result = diff_days(temperature, date)) %>%
  ungroup()

#    date       temperature met_square result
#   <date>           <dbl>      <dbl>  <dbl>
# 1 2000-07-01          -1          1    182
# 2 2000-06-01           2          1     31
# 3 2000-05-01           1          1     30
# 4 2000-04-01           0          1     91
# 5 2000-03-01           1          1     29
# 6 2000-02-01           1          1     31
# 7 2000-01-01          -1          1     NA
# 8 2000-07-01          -2          2     NA
# 9 2000-06-01           3          2     31
#10 2000-05-01           2          2     30
#11 2000-04-01           0          2     31
#12 2000-03-01          -1          2     60
#13 2000-02-01           2          2     31
#14 2000-01-01          -1          2     NA
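
One caveat: for a group with a single row, sapply runs over an empty index and returns a list, so diff_days no longer gives a plain numeric vector. If that can happen in your data, a guarded variant along these lines may help. It is a sketch only; diff_days_safe is a hypothetical name, and it uses vapply so the return type is always numeric.

```r
diff_days_safe <- function(temp, date) {
  n <- length(temp)
  # Nothing to compare against when the group has fewer than two rows
  if (n < 2) return(rep(NA_real_, n))
  c(vapply(seq_len(n - 1), function(x) {
    earlier <- -(1:x)  # drop rows 1..x, keeping the older records
    ind <- which(temp[x] >= temp[earlier])[1]
    as.numeric(date[x] - date[earlier][ind])
  }, numeric(1)), NA_real_)
}

# Example data from the question (newest date first)
dates <- as.Date(c("01/07/2000", "01/06/2000", "01/05/2000", "01/04/2000",
                   "01/03/2000", "01/02/2000", "01/01/2000"), "%d/%m/%Y")
out <- diff_days_safe(c(-1, 2, 1, 0, 1, 1, -1), dates)
single <- diff_days_safe(5, as.Date("2000-01-01"))
```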

Here is working code, based on Jay's answer above.

library(data.table)

df <- data.frame(
  date = as.Date(c("01/07/2000", "01/06/2000", "01/05/2000", "01/04/2000",
                   "01/03/2000", "01/02/2000", "01/01/2000",
                   "01/07/2000", "01/06/2000", "01/05/2000", "01/04/2000",
                   "01/03/2000", "01/02/2000", "01/01/2000"), "%d/%m/%Y"),
  temperature = c(-1, 2, 1, 0, 1, 1, -1, -2, 3, 2, 0, -1, 2, -1),
  met_square = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2))

setDT(df)

df3 <- df[order(date), ]  # make sure the dates are in ascending order

# Days since the temperature was last as cold as, or colder than, on date x
f <- Vectorize(function(data, x) {
  diff(rev(with(data, date[date <= x & temperature <= temperature[date == x]]))[2:1])
}, vectorize.args = "x")

res <- do.call(rbind, by(df3, df3$met_square, function(g) cbind(g, last = f(g, g$date))))

res
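
As a sanity check, the same function can be run in base R on met_square 1 alone, without data.table. This is a sketch with hypothetical names (g1, last). The values match the sapply answer above, just in ascending date order.

```r
# met_square 1 from the data above, in ascending date order
g1 <- data.frame(
  date = seq(as.Date("2000-01-01"), as.Date("2000-07-01"), by = "month"),
  temperature = c(-1, 1, 1, 0, 1, 2, -1)
)

# Same function as above: days since it was last as cold or colder
f <- Vectorize(function(data, x) {
  diff(rev(with(data, date[date <= x & temperature <= temperature[date == x]]))[2:1])
}, vectorize.args = "x")

last <- f(g1, g1$date)
last
#[1]  NA  31  29  91  30  31 182
```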