How to calculate the time since the temperature was last this cold
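I would like some help calculating the time since the temperature was last as cold as on a given date.
In the example data frame below, for the first record (01/07/2000), the last time it was this cold (-1) was 01/01/2000 (about 182 days earlier).
For the second record (01/06/2000), the last time it was this cold (2 degrees) was the previous month (01/05/2000), when it was actually colder (1 degree), so about 30 days earlier.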
df <- data.frame(date=as.Date(c("01/07/2000", "01/06/2000", "01/05/2000",
"01/04/2000", "01/03/2000", "01/02/2000",
"01/01/2000"), "%d/%m/%Y"),
temperature =c(-1, 2, 1, 0, 1, 1, -1))
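I tried adapting another approach, but found it became unwieldy when calculating per week.
Do you have any ideas for calculating, for each week, the number of days since the weather was last this cold? Many thanks for your help.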
Suppose you have temperature data for different grids like this:
# date grid temp
# 1 2000-01-01 A -1
# 2 2000-02-01 A -1
# 3 2000-03-01 A -1
# ...
# 10 2000-01-01 B 2
# 11 2000-02-01 B 1
# ...
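You can use by to apply a split-apply-combine approach along the grids. Within each grid cell, we apply a Vectorized function that calculates the difference in days since the temperature of a given date last occurred. If there is no earlier occurrence, it gives NA.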
f <- Vectorize(function(data, x) {
diff(rev(with(data, date[date <= x & temp == temp[date == x]]))[2:1])
}, vectorize.args="x")
res <- do.call(rbind, by(d, d$grid, function(g) cbind(g, last=f(g, g$date))))
res
# date grid temp last
# A.1 2000-01-01 A -1 NA
# A.2 2000-02-01 A -1 31
# A.3 2000-03-01 A -1 29
# A.4 2000-04-01 A -1 31
# A.5 2000-05-01 A 0 NA
# A.6 2000-06-01 A 2 NA
# A.7 2000-07-01 A 0 61
# A.8 2000-08-01 A 0 31
# A.9 2000-09-01 A -1 153
# B.10 2000-01-01 B 2 NA
# B.11 2000-02-01 B 1 NA
# B.12 2000-03-01 B 2 60
# B.13 2000-04-01 B 1 60
# B.14 2000-05-01 B 2 61
# B.15 2000-06-01 B -1 NA
# B.16 2000-07-01 B -1 30
# B.17 2000-08-01 B 0 NA
# B.18 2000-09-01 B 2 123
# C.19 2000-01-01 C 0 NA
# C.20 2000-02-01 C 0 31
# C.21 2000-03-01 C 1 NA
# C.22 2000-04-01 C 1 31
# C.23 2000-05-01 C -1 NA
# C.24 2000-06-01 C -1 31
# C.25 2000-07-01 C 1 91
# C.26 2000-08-01 C 2 NA
# C.27 2000-09-01 C -1 92
Edit
To find out when the temperature last fell below a specific threshold temp.th, we can modify the function like this:
temp.th <- 0
f2 <- Vectorize(function(data, x) {
x - rev(with(data, date[date <= x & temp < temp.th]))[1]
}, vectorize.args="x")
res2 <- do.call(rbind, by(d, d$grid, function(g) cbind(g, last=f2(g, g$date))))
res2
# date grid temp last
# A.1 2000-01-01 A -1 0
# A.2 2000-02-01 A -1 0
# A.3 2000-03-01 A -1 0
# A.4 2000-04-01 A -1 0
# A.5 2000-05-01 A 0 30
# A.6 2000-06-01 A 2 61
# A.7 2000-07-01 A 0 91
# A.8 2000-08-01 A 0 122
# A.9 2000-09-01 A -1 0
# B.10 2000-01-01 B 2 NA
# B.11 2000-02-01 B 1 NA
# B.12 2000-03-01 B 2 NA
# B.13 2000-04-01 B 1 NA
# B.14 2000-05-01 B 2 NA
# B.15 2000-06-01 B -1 0
# B.16 2000-07-01 B -1 0
# B.17 2000-08-01 B 0 31
# B.18 2000-09-01 B 2 62
# C.19 2000-01-01 C 0 NA
# C.20 2000-02-01 C 0 NA
# C.21 2000-03-01 C 1 NA
# C.22 2000-04-01 C 1 NA
# C.23 2000-05-01 C -1 0
# C.24 2000-06-01 C -1 0
# C.25 2000-07-01 C 1 30
# C.26 2000-08-01 C 2 61
# C.27 2000-09-01 C -1 0
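Note that a date which is itself below the threshold reports 0, because the comparison includes the date in question. If you would rather get the gap to the previous sub-threshold date, one possible variation is sketched below. The helper days_since_below, its th and strict arguments, and the toy data g are assumptions for illustration, not part of the answer above.

```r
# Hypothetical data for one grid cell (values chosen for illustration only)
g <- data.frame(
  date = as.Date(c("2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01")),
  temp = c(-1, 0, -1, 2)
)

# Days since the temperature was last below th.
# strict = FALSE counts the date x itself (as f2 does); strict = TRUE only
# looks at dates before x.
days_since_below <- function(data, x, th = 0, strict = FALSE) {
  earlier <- if (strict) data$date < x else data$date <= x
  hits <- data$date[earlier & data$temp < th]
  if (length(hits) == 0) return(NA_real_)  # never below th so far
  as.numeric(x - max(hits))                # gap to the most recent hit
}

incl <- sapply(seq_along(g$date), function(i) days_since_below(g, g$date[i]))
excl <- sapply(seq_along(g$date),
               function(i) days_since_below(g, g$date[i], strict = TRUE))
incl
excl
```

On this toy data the inclusive version gives 0, 31, 0, 31 and the strict version gives NA, 31, 60, 31.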
Data:
d <- expand.grid(date=seq(as.Date("2000-01-01"), as.Date("2000-09-01"), by="month"),
grid=LETTERS[1:3])
set.seed(42)
d$temp <- sample(-1:2, nrow(d), replace=TRUE)
A base R option using sapply:
c(sapply(seq(nrow(df) - 1), function(x) {
tmp <- -(1:x)
inds <- which(df$temperature[x] >= df$temperature[tmp])[1]
df$date[x] - df$date[tmp][inds]
}), NA)
#[1] 182 31 30 91 29 31 NA
This assumes your data is sorted in descending order, i.e. the most recent date comes first (as in your example data).
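If the rows are not already newest-first, sorting them before the sapply call should be enough. A minimal sketch, assuming a shuffled copy of the question's data (the name df_unsorted is made up for illustration):

```r
# The question's seven records, deliberately shuffled
df_unsorted <- data.frame(
  date = as.Date(c("01/03/2000", "01/07/2000", "01/01/2000", "01/05/2000",
                   "01/02/2000", "01/06/2000", "01/04/2000"), "%d/%m/%Y"),
  temperature = c(1, -1, -1, 1, 1, 2, 0)
)

# Sort so that the most recent date comes first
df_sorted <- df_unsorted[order(df_unsorted$date, decreasing = TRUE), ]

# Same logic as the sapply call, applied to the sorted copy
res_sorted <- c(sapply(seq(nrow(df_sorted) - 1), function(x) {
  tmp <- -(1:x)  # drop rows 1..x, leaving only the older records
  inds <- which(df_sorted$temperature[x] >= df_sorted$temperature[tmp])[1]
  df_sorted$date[x] - df_sorted$date[tmp][inds]
}), NA)
res_sorted
```

After sorting, this reproduces the result for the original example: 182 31 30 91 29 31 NA.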
To apply this by group, we can turn the sapply code into a function:
diff_days <- function(temp, date) {
c(sapply(seq_len(length(temp) - 1), function(x) {
tmp <- -(1:x)
inds <- which(temp[x] >= temp[tmp])[1]
date[x] - date[tmp][inds]
}), NA)
}
library(dplyr)
df %>%
group_by(met_square) %>%
mutate(result = diff_days(temperature, date)) %>%
ungroup()
# date temperature met_square result
# <date> <dbl> <dbl> <dbl>
# 1 2000-07-01 -1 1 182
# 2 2000-06-01 2 1 31
# 3 2000-05-01 1 1 30
# 4 2000-04-01 0 1 91
# 5 2000-03-01 1 1 29
# 6 2000-02-01 1 1 31
# 7 2000-01-01 -1 1 NA
# 8 2000-07-01 -2 2 NA
# 9 2000-06-01 3 2 31
#10 2000-05-01 2 2 30
#11 2000-04-01 0 2 31
#12 2000-03-01 -1 2 60
#13 2000-02-01 2 2 31
#14 2000-01-01 -1 2 NA
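If you would rather not depend on dplyr, the same grouping can be sketched in base R by looping over the row indices of each group. The helper below is a restatement of diff_days using vapply (an assumption of this sketch, chosen so that the result is always a plain numeric vector); the data is the two-square example from this thread.

```r
df <- data.frame(
  date = rep(as.Date(c("01/07/2000", "01/06/2000", "01/05/2000", "01/04/2000",
                       "01/03/2000", "01/02/2000", "01/01/2000"), "%d/%m/%Y"), 2),
  temperature = c(-1, 2, 1, 0, 1, 1, -1, -2, 3, 2, 0, -1, 2, -1),
  met_square = rep(1:2, each = 7)
)

# Days back to the most recent older record that was at least as cold.
# Assumes temp and date are sorted newest-first within the group.
diff_days <- function(temp, date) {
  n <- length(temp)
  c(vapply(seq_len(max(n - 1, 0)), function(x) {
    older <- -(1:x)                           # keep only older records
    hit <- which(temp[x] >= temp[older])[1]   # first older record as cold
    as.numeric(date[x] - date[older][hit])    # NA if there is no such record
  }, numeric(1)), NA)
}

# Apply the helper to each met_square separately
df$result <- NA_real_
for (idx in split(seq_len(nrow(df)), df$met_square)) {
  df$result[idx] <- diff_days(df$temperature[idx], df$date[idx])
}
df
```

The result column should match the dplyr output: 182 31 30 91 29 31 NA for square 1 and NA 31 30 31 60 31 NA for square 2.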
Here is working code, based on Jay's answer above:
library(data.table)
df <- data.frame(date=as.Date(c("01/07/2000", "01/06/2000", "01/05/2000", "01/04/2000", "01/03/2000", "01/02/2000", "01/01/2000", "01/07/2000", "01/06/2000", "01/05/2000", "01/04/2000", "01/03/2000", "01/02/2000", "01/01/2000"), "%d/%m/%Y"),
temperature =c(-1, 2, 1, 0, 1, 1, -1, -2, 3, 2, 0, -1, 2, -1 ),
met_square = c(1,1,1,1,1,1,1, 2,2,2,2,2,2,2))
setDT(df)
df3 <- df[order(date),] # making sure the dates are in the right order
f <- Vectorize(function(data, x) {
diff(rev(with(data, date[date <= x & temperature <= temperature[date == x]]))[2:1])
}, vectorize.args="x")
res <- do.call(rbind, by(df3, df3$met_square, function(g) cbind(g, last=f(g, g$date))))
res
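The only change from Jay's function is the comparison: temperature <= instead of temp ==, so the previous date counts if it was as cold or colder, which is what the question asks for. A small sketch of the difference on made-up data (g, x, t_x and prev_gap are hypothetical names used only here):

```r
# Hypothetical data for one square (values chosen for illustration only)
g <- data.frame(
  date = as.Date(c("2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01")),
  temperature = c(-1, 1, 0, 1)
)
x <- as.Date("2000-04-01")
t_x <- g$temperature[g$date == x]  # temperature on the date of interest

# Gap in days between x and the previous date selected by the logical mask
prev_gap <- function(keep) as.numeric(diff(rev(g$date[keep])[2:1]))

gap_equal   <- prev_gap(g$date <= x & g$temperature == t_x)  # same temperature
gap_at_most <- prev_gap(g$date <= x & g$temperature <= t_x)  # as cold or colder
c(equal = gap_equal, at_most = gap_at_most)
```

Here the last date with exactly 1 degree is 2000-02-01 (60 days earlier), but 2000-03-01 was colder at 0 degrees, so the <= version reports 31 days.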