Find average change in timeseries
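I have a dataset of annual average time series covering 15 years, and I am trying to find the average change (increase/decrease) in this time series.
My time series is spatial (the averages are repeated for each grid cell/pixel).
How can I do this in R with dplyr?
Example data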
year = c(2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006, 2007, 2007, 2007, 2007, 2008, 2008, 2008, 2008)
Tmean = c(24, 24.5, 25.8,25, 24.8, 25, 23.5, 23.8, 24.8, 25, 25.2, 25.8, 25.3, 25.6, 25.2, 25)
Code
library(tidyverse)
df = data.frame(year, Tmean)
change = df$year %>%
# Sort by year
arrange(year) %>%
mutate(Diff_change = Tmean - lag(Tmean), # Difference in Tmean between years
Rate_percent = (Diff_change / year)/Tmean * 100) # Percent change # **returns inf values**
Average_change = mean(change$Rate_percent, na.rm = TRUE)
To find the mean, use mean(). To find differences or changes, use diff().
So, to find the average change:
> avg_change <- mean(diff(Tmean))
> print(avg_change)
[1] 0.06666667
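As a quick sanity check on that number: successive differences telescope, so their mean is just the last value minus the first, divided by the number of steps. A minimal sketch using the sample Tmean vector from the question (the names n and endpoint_change are only illustrative):

```r
# Sample data from the question
Tmean <- c(24, 24.5, 25.8, 25, 24.8, 25, 23.5, 23.8,
           24.8, 25, 25.2, 25.8, 25.3, 25.6, 25.2, 25)

n <- length(Tmean)

# Mean of the successive differences
avg_change <- mean(diff(Tmean))

# Same quantity computed from the endpoints only
endpoint_change <- (Tmean[n] - Tmean[1]) / (n - 1)

print(all.equal(avg_change, endpoint_change))
```

In other words, this average depends only on the first and last values, so the row order matters when several grid cells share one vector.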
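If you need a percentage, then you want to know how much each element differs from the one before it (this year minus last year), relative to the previous one (last year), like this: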
> pct_change <- Tmean[2:length(Tmean)] / Tmean[1:(length(Tmean)-1)] - 1
> avg_pct_change <- mean(pct_change) * 100
> print(avg_pct_change)
[1] 0.3101632
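The same percent change can also be written with diff() and head(), which avoids the manual index ranges. This is only an equivalent spelling of the calculation above, sketched on the sample data from the question:

```r
# Sample data from the question
Tmean <- c(24, 24.5, 25.8, 25, 24.8, 25, 23.5, 23.8,
           24.8, 25, 25.2, 25.8, 25.3, 25.6, 25.2, 25)

# diff(Tmean) is (current - previous); head(Tmean, -1) drops the last
# element, leaving the "previous" value for each difference
pct_change <- diff(Tmean) / head(Tmean, -1)

avg_pct_change <- mean(pct_change) * 100
print(avg_pct_change)
```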
We can put these vectors into a data frame to use with dplyr (if that is what you want; this is also simple in base R).
library(dplyr)
df <- data.frame(year, Tmean)
change <- df %>%
  arrange(year) %>%
  mutate(Diff_change = Tmean - lag(Tmean),               # Difference in Tmean from the previous row
         # year repeats within each grid cell, so dividing by year - lag(year)
         # would divide by zero; the rate is taken per row instead
         Rate_percent = Diff_change / lag(Tmean) * 100)  # Percent change relative to the previous row
Average_change = mean(change$Rate_percent, na.rm = TRUE)
Result (using the updated question data)
> change
   year Tmean Diff_change Rate_percent
1  2005  24.0          NA           NA
2  2005  24.5         0.5    2.0833333
3  2005  25.8         1.3    5.3061224
4  2005  25.0        -0.8   -3.1007752
5  2006  24.8        -0.2   -0.8000000
6  2006  25.0         0.2    0.8064516
7  2006  23.5        -1.5   -6.0000000
8  2006  23.8         0.3    1.2765957
9  2007  24.8         1.0    4.2016807
10 2007  25.0         0.2    0.8064516
11 2007  25.2         0.2    0.8000000
12 2007  25.8         0.6    2.3809524
13 2008  25.3        -0.5   -1.9379845
14 2008  25.6         0.3    1.1857708
15 2008  25.2        -0.4   -1.5625000
16 2008  25.0        -0.2   -0.7936508
> Average_change
[1] 0.3101632
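Because the sample data has several rows per year (one per grid cell), the table above compares neighbouring rows rather than neighbouring years. If what you want is the year-to-year change, one option is to collapse to a single mean per year first. This is a sketch under that assumption, not the only reasonable way to treat the spatial dimension; the names yearly, avg_yearly_change and avg_yearly_pct are illustrative:

```r
library(dplyr)

# Sample data from the question
year  <- c(2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006,
           2007, 2007, 2007, 2007, 2008, 2008, 2008, 2008)
Tmean <- c(24, 24.5, 25.8, 25, 24.8, 25, 23.5, 23.8,
           24.8, 25, 25.2, 25.8, 25.3, 25.6, 25.2, 25)
df <- data.frame(year, Tmean)

yearly <- df %>%
  group_by(year) %>%
  summarise(Tmean = mean(Tmean), .groups = "drop") %>%  # one mean per year
  arrange(year) %>%
  mutate(Diff_change  = Tmean - lag(Tmean),             # change from the previous year
         Rate_percent = Diff_change / lag(Tmean) * 100) # percent change from the previous year

avg_yearly_change <- mean(yearly$Diff_change, na.rm = TRUE)
avg_yearly_pct    <- mean(yearly$Rate_percent, na.rm = TRUE)

print(yearly)
print(avg_yearly_change)
print(avg_yearly_pct)
```

If you need a separate trend for each grid cell instead, the data would need a cell identifier column (not present in the sample) to group by before calling lag().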