How to calculate the difference between data in different rows?

I have monthly data in this format:

   PrecipMM          Date
    122.7         2004-01-01
     54.2         2005-01-01
     31.9         2006-01-01
    100.5         2007-01-01
    144.9         2008-01-01
     96.4         2009-01-01
     75.3         2010-01-01
     94.8         2011-01-01
     67.6         2012-01-01
     93.0         2013-01-01
    184.6         2014-01-01
    101.0         2015-01-01
    149.3         2016-01-01
     50.2         2004-02-01
     46.2         2005-02-01
     57.7         2006-02-01

I want to calculate the difference in PrecipMM between consecutive years for the same month.

My desired output looks like this:

   PrecipMM          Date         PrecipMM_diff
    122.7         2004-01-01           NA
     54.2         2005-01-01         -68.5
     31.9         2006-01-01         -22.3
    100.5         2007-01-01          68.6
    144.9         2008-01-01          44.4   
     96.4         2009-01-01         -48.5
     75.3         2010-01-01         -21.1
     94.8         2011-01-01          19.5
     67.6         2012-01-01         -27.2
     93.0         2013-01-01          25.4
    184.6         2014-01-01          91.6
    101.0         2015-01-01         -83.6 
    149.3         2016-01-01          48.3
     50.2         2004-02-01           NA
     46.2         2005-02-01          -4.0
     57.7         2006-02-01          11.5

I think diff() can do this, but I don't know how.

I think you can do this using lag() in conjunction with group_by() from dplyr. Here's how:

library(dplyr)
library(lubridate)  # makes dealing with dates easier

# Load your example data
df <- structure(list(PrecipMM = c(4.4, 66.7, 48.2, 60.9, 108.1, 109.2, 
101.7, 38.1, 53.8, 71.9, 75.4, 67.1, 92.7, 115.3, 68.9, 38.9), 
    Date = structure(5:20, .Label = c("101.7", "108.1", "109.2", 
    "115.3", "1766-01-01", "1766-02-01", "1766-03-01", "1766-04-01", 
    "1766-05-01", "1766-06-01", "1766-07-01", "1766-08-01", "1766-09-01", 
    "1766-10-01", "1766-11-01", "1766-12-01", "1767-01-01", "1767-02-01", 
    "1767-03-01", "1767-04-01", "38.1", "38.9", "4.4", "48.2", 
    "53.8", "60.9", "66.7", "67.1", "68.9", "71.9", "75.4", "92.7"
    ), class = "factor")), class = "data.frame", row.names = c(NA, 
-16L), .Names = c("PrecipMM", "Date"))

results <- df %>% 
  mutate(years = year(Date), months = month(Date)) %>%
  group_by(months) %>%
  arrange(months, years) %>%  # explicit sort: newer dplyr versions ignore groups in arrange()
  mutate(lagged.rain = lag(PrecipMM), rain.diff = PrecipMM - lagged.rain)

results
# Source: local data frame [16 x 6]
# Groups: months [12]
# 
#    PrecipMM       Date years months lagged.rain rain.diff
#       (dbl)     (fctr) (dbl)  (dbl)       (dbl)     (dbl)
# 1       4.4 1766-01-01  1766      1          NA        NA
# 2      92.7 1767-01-01  1767      1         4.4      88.3
# 3      66.7 1766-02-01  1766      2          NA        NA
# 4     115.3 1767-02-01  1767      2        66.7      48.6
# 5      48.2 1766-03-01  1766      3          NA        NA
# 6      68.9 1767-03-01  1767      3        48.2      20.7
# 7      60.9 1766-04-01  1766      4          NA        NA
# 8      38.9 1767-04-01  1767      4        60.9     -22.0
# 9     108.1 1766-05-01  1766      5          NA        NA
# 10    109.2 1766-06-01  1766      6          NA        NA
# 11    101.7 1766-07-01  1766      7          NA        NA
# 12     38.1 1766-08-01  1766      8          NA        NA
# 13     53.8 1766-09-01  1766      9          NA        NA
# 14     71.9 1766-10-01  1766     10          NA        NA
# 15     75.4 1766-11-01  1766     11          NA        NA
# 16     67.1 1766-12-01  1766     12          NA        NA
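
Since the question mentions diff(), here is a minimal base R sketch of the same idea applied to the sample data from the question. It assumes the Date column can be parsed with as.Date() and that each month has one row per year with no gaps; the column name PrecipMM_diff is taken from the desired output, and the helper variable month is just for illustration. diff() returns one value fewer than its input, so each group is padded with a leading NA.

```r
# Sample data from the question; Date is parsed as a real Date here
# (an assumption: the original column may be character or factor).
df <- data.frame(
  PrecipMM = c(122.7, 54.2, 31.9, 100.5, 144.9, 96.4, 75.3, 94.8,
               67.6, 93.0, 184.6, 101.0, 149.3, 50.2, 46.2, 57.7),
  Date = as.Date(c(paste0(2004:2016, "-01-01"),
                   paste0(2004:2006, "-02-01")))
)

# Sort by month, then by date, so that consecutive rows within a month
# are consecutive years.
df <- df[order(format(df$Date, "%m"), df$Date), ]
month <- format(df$Date, "%m")  # grouping key, computed after sorting

# Apply diff() within each month and pad with a leading NA so the
# result has the same length as the group.
df$PrecipMM_diff <- ave(df$PrecipMM, month,
                        FUN = function(x) c(NA, diff(x)))

df
```

ave() keeps the result aligned with the original rows, so the first year of every month ends up as NA, just like lag() in the dplyr version.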