平滑曲线上的最大曲线变化率
Maximum rate of curve change on a smoothed curve
我有一个时间序列数据集,我正在尝试计算最大变化率以估计 NDVI 数据的绿化日期。我的数据如下:
date NDVI
1 2021-01-01 0.307
2 2021-01-17 0.291
3 2021-02-02 0.287
4 2021-02-18 0.247
5 2021-03-06 0.286
6 2021-03-22 0.268
7 2021-04-07 0.299
8 2021-04-23 0.295
9 2021-05-09 0.349
10 2021-05-25 0.402
11 2021-06-10 0.359
12 2021-06-26 0.432
13 2021-07-12 0.564
14 2021-07-28 0.654
15 2021-08-13 0.699
16 2021-08-29 0.614
17 2021-09-14 0.588
18 2021-09-30 0.553
19 2021-10-16 0.450
20 2021-11-01 0.377
21 2021-11-17 0.341
22 2021-12-03 0.331
23 2021-12-19 0.323
#I plot my dataset and fit a curve
p1 <- ggplot(data,aes(x = date, y = NDVI)) + stat_smooth(method = "lm", formula = y ~ ns(x,3), color="blue") + geom_point()
p1
我现在希望能够计算曲线变化的最大速率,以确定植被何时开始变绿(我根据该图猜测是在 5 月的某个时候)。
感谢您的帮助。
一种可能的方法是构建 ggplot,提取样条曲线,然后计算其控制点:
library(dplyr); library(ggplot2)
a <- mtcars %>%
ggplot(aes(drat, mpg)) +
stat_smooth(method = "lm", formula = y ~ splines::ns(x,3), color="blue") +
geom_point()
b <- ggplot_build(a)
spline_x <- b[["data"]][[1]][["x"]]
spline_y <- b[["data"]][[1]][["y"]]
change_rate <- c(NA, diff(spline_y)/ diff(spline_x))
a +
geom_line(data = data.frame(spline_x, change_rate),
aes(spline_x, change_rate), color = "red")
然后我们可以提取最大值 change_rate:
data.frame(spline_x, change_rate) %>%
filter(change_rate == max(change_rate, na.rm = TRUE))
spline_x change_rate
1 3.748861 10.68438
此模型的最大增长率出现在 2021 年 6 月 11 日 9:13pm。获得此结果的一种简单方法是在 ggplot 之外创建模型:
library(splines)
mod <- lm(NDVI ~ ns(date, 3), data = data)
现在用它来预测一年中每一天的 NDVI:
dates <- seq(as.Date("2021-01-01"), as.Date("2021-12-31"), by = "day")
predictions <- predict(mod, newdata = list(date = dates))
最大变化率出现在连续几天的预测之间的最大差异处:
dates[which.max(diff(predictions))]
# [1] "2021-06-11"
如果您需要精确到小时和分钟(根据数据点的数量这是不合理的),您可以这样做:
data$date <- as.POSIXct(data$date)
mod <- lm(NDVI ~ ns(date, 3), data = data)
dates <- seq(as.POSIXct("2021-01-01"), as.POSIXct("2021-12-31"), by = "min")
predictions <- predict(mod, newdata = list(date = dates))
dates[which.max(diff(predictions))]
#> [1] "2021-06-11 21:13:00 GMT"
Savitsky-Golay 过滤器(此处使用 signal::sgolayfilt()
)将平滑您的数据,并且可以选择 return m-th 导数。因此,您可以 运行 获得一阶导数,然后提取最大值的日期。您可以调整过滤器的 length
以确定您想要的平滑度与锯齿度。
library(tidyverse)
library(signal)
d <- structure(list(date = structure(c(18628L, 18644L, 18660L, 18676L, 18692L, 18708L, 18724L, 18740L, 18756L, 18772L, 18788L, 18804L, 18820L, 18836L, 18852L, 18868L, 18884L, 18900L, 18916L, 18932L, 18948L, 18964L, 18980L), class = c("IDate", "Date")), NDVI = c(0.307, 0.291, 0.287, 0.247, 0.286, 0.268, 0.299, 0.295, 0.349, 0.402, 0.359, 0.432, 0.564, 0.654, 0.699, 0.614, 0.588, 0.553, 0.45, 0.377, 0.341, 0.331, 0.323)), row.names = c(NA, -23L), class = "data.frame")
# add sg smooth and derivative
d <- d %>%
mutate(date = as.Date(date)) %>%
mutate(sg = sgolayfilt(NDVI, n = 9)) %>%
mutate(dydx = sgolayfilt(NDVI, n = 9, m = 1))
# plot original data with smooth and derivative and annotate date of max slope
d %>%
ggplot(aes(date)) +
geom_point(aes(y = NDVI)) +
geom_line(aes(y = sg), color = "blue", size = 1) +
geom_line(aes(y = dydx), color = "red", size = 1) +
geom_vline(aes(xintercept = date[which.max(dydx)]), linetype = "dashed", size = 1.5) +
annotate("text", x = min(d$date), y = max(d$NDVI), hjust = 0,
label = paste0("Max rate of change on: ", d$date[which.max(d$dydx)]))
由 reprex package (v2.0.1)
于 2022-04-01 创建
我有一个时间序列数据集,我正在尝试计算最大变化率以估计 NDVI 数据的绿化日期。我的数据如下:
date NDVI
1 2021-01-01 0.307
2 2021-01-17 0.291
3 2021-02-02 0.287
4 2021-02-18 0.247
5 2021-03-06 0.286
6 2021-03-22 0.268
7 2021-04-07 0.299
8 2021-04-23 0.295
9 2021-05-09 0.349
10 2021-05-25 0.402
11 2021-06-10 0.359
12 2021-06-26 0.432
13 2021-07-12 0.564
14 2021-07-28 0.654
15 2021-08-13 0.699
16 2021-08-29 0.614
17 2021-09-14 0.588
18 2021-09-30 0.553
19 2021-10-16 0.450
20 2021-11-01 0.377
21 2021-11-17 0.341
22 2021-12-03 0.331
23 2021-12-19 0.323
#I plot my dataset and fit a curve
p1 <- ggplot(data,aes(x = date, y = NDVI)) + stat_smooth(method = "lm", formula = y ~ ns(x,3), color="blue") + geom_point()
p1
我现在希望能够计算曲线变化的最大速率,以确定植被何时开始变绿(我根据该图猜测是在 5 月的某个时候)。
感谢您的帮助。
一种可能的方法是构建 ggplot,提取样条曲线,然后计算其控制点:
library(dplyr); library(ggplot2)
a <- mtcars %>%
ggplot(aes(drat, mpg)) +
stat_smooth(method = "lm", formula = y ~ splines::ns(x,3), color="blue") +
geom_point()
b <- ggplot_build(a)
spline_x <- b[["data"]][[1]][["x"]]
spline_y <- b[["data"]][[1]][["y"]]
change_rate <- c(NA, diff(spline_y)/ diff(spline_x))
a +
geom_line(data = data.frame(spline_x, change_rate),
aes(spline_x, change_rate), color = "red")
然后我们可以提取最大值 change_rate:
data.frame(spline_x, change_rate) %>%
filter(change_rate == max(change_rate, na.rm = TRUE))
spline_x change_rate
1 3.748861 10.68438
此模型的最大增长率出现在 2021 年 6 月 11 日 9:13pm。获得此结果的一种简单方法是在 ggplot 之外创建模型:
library(splines)
mod <- lm(NDVI ~ ns(date, 3), data = data)
现在用它来预测一年中每一天的 NDVI:
dates <- seq(as.Date("2021-01-01"), as.Date("2021-12-31"), by = "day")
predictions <- predict(mod, newdata = list(date = dates))
最大变化率出现在连续几天的预测之间的最大差异处:
dates[which.max(diff(predictions))]
# [1] "2021-06-11"
如果您需要精确到小时和分钟(根据数据点的数量这是不合理的),您可以这样做:
data$date <- as.POSIXct(data$date)
mod <- lm(NDVI ~ ns(date, 3), data = data)
dates <- seq(as.POSIXct("2021-01-01"), as.POSIXct("2021-12-31"), by = "min")
predictions <- predict(mod, newdata = list(date = dates))
dates[which.max(diff(predictions))]
#> [1] "2021-06-11 21:13:00 GMT"
Savitsky-Golay 过滤器(此处使用 signal::sgolayfilt()
)将平滑您的数据,并且可以选择 return m-th 导数。因此,您可以 运行 获得一阶导数,然后提取最大值的日期。您可以调整过滤器的 length
以确定您想要的平滑度与锯齿度。
library(tidyverse)
library(signal)
d <- structure(list(date = structure(c(18628L, 18644L, 18660L, 18676L, 18692L, 18708L, 18724L, 18740L, 18756L, 18772L, 18788L, 18804L, 18820L, 18836L, 18852L, 18868L, 18884L, 18900L, 18916L, 18932L, 18948L, 18964L, 18980L), class = c("IDate", "Date")), NDVI = c(0.307, 0.291, 0.287, 0.247, 0.286, 0.268, 0.299, 0.295, 0.349, 0.402, 0.359, 0.432, 0.564, 0.654, 0.699, 0.614, 0.588, 0.553, 0.45, 0.377, 0.341, 0.331, 0.323)), row.names = c(NA, -23L), class = "data.frame")
# add sg smooth and derivative
d <- d %>%
mutate(date = as.Date(date)) %>%
mutate(sg = sgolayfilt(NDVI, n = 9)) %>%
mutate(dydx = sgolayfilt(NDVI, n = 9, m = 1))
# plot original data with smooth and derivative and annotate date of max slope
d %>%
ggplot(aes(date)) +
geom_point(aes(y = NDVI)) +
geom_line(aes(y = sg), color = "blue", size = 1) +
geom_line(aes(y = dydx), color = "red", size = 1) +
geom_vline(aes(xintercept = date[which.max(dydx)]), linetype = "dashed", size = 1.5) +
annotate("text", x = min(d$date), y = max(d$NDVI), hjust = 0,
label = paste0("Max rate of change on: ", d$date[which.max(d$dydx)]))
由 reprex package (v2.0.1)
于 2022-04-01 创建