"Lead" function in R not working correctly

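I am trying to use the lead function in R so that the last value of val.test, for the grouping variable var, becomes the first value of val.test.lead, with every other value moved down one row.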

library(dplyr)

set.seed(14)
df <- data.frame(
  var = c("A","B","C","D","E","F","G","H","I","J","K","L"),
  val.test = rnorm(12,4,5)
)

df$var <- as.factor(df$var)

df <- df %>% 
  dplyr::group_by(var) %>% 
  dplyr::mutate(val.test.lead = lead(val.test, default = first(val.test)))

#The output is

  var   val.test val.test.lead
   <fct>    <dbl>         <dbl>
 1 A        0.691         0.691
 2 B       12.6          12.6  
 3 C       14.6          14.6  
 4 D       11.5          11.5  
 5 E        3.82          3.82 
 6 F       10.2          10.2  
 7 G        3.68          3.68 
 8 H        9.34          9.34 
 9 I        2.12          2.12 
10 J        9.22          9.22 
11 K        2.09          2.09 
12 L        5.50          5.50 

#The expected output is

  var   val.test val.test.lead
   <fct>    <dbl>         <dbl>
 1 A        0.691         5.50
 2 B       12.6           0.691 
 3 C       14.6           12.6
 4 D       11.5           14.6 
 5 E        3.82          11.5
 6 F       10.2           3.82 
 7 G        3.68          10.2
 8 H        9.34          3.68  
 9 I        2.12          9.34 
10 J        9.22          2.12 
11 K        2.09          9.22 
12 L        5.50          2.09 

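It looks like you need lag() rather than lead() (and I don't understand why your group_by() is there). Every value of var is unique, so each group contains a single row and lead() can only return its default, which is that row's own val.test.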

df %>% 
  dplyr::mutate(val.test.lead = dplyr::lag(val.test, default = last(val.test)))

   var val.test val.test.lead
1    A     0.69          5.50
2    B    12.59          0.69
3    C    14.61         12.59
4    D    11.49         14.61
5    E     3.82         11.49
6    F    10.16          3.82
7    G     3.68         10.16
8    H     9.34          3.68
9    I     2.12          9.34
10   J     9.22          2.12
11   K     2.09          9.22
12   L     5.50          2.09
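
To make the effect of group_by() concrete, here is a minimal sketch on a hypothetical three-row data frame (toy, val, val.lead and val.lag are made-up names, not from the question). It assumes dplyr is installed and contrasts the grouped lead() call with the ungrouped lag() call:

```r
library(dplyr)

# Hypothetical toy data: every key in `var` is unique, as in the question.
toy <- data.frame(var = c("A", "B", "C"), val = c(10, 20, 30))

# Grouped: each group holds exactly one row, so lead() has nothing to shift
# and falls back to `default`, which is that row's own value.
grouped <- toy %>%
  group_by(var) %>%
  mutate(val.lead = dplyr::lead(val, default = first(val))) %>%
  ungroup()

# Ungrouped: lag() shifts the whole column down by one row and the default
# wraps the last value around to the first row.
ungrouped <- toy %>%
  mutate(val.lag = dplyr::lag(val, default = last(val)))

grouped$val.lead   # 10 20 30
ungrouped$val.lag  # 30 10 20
```

If var had repeated keys, the grouped version would shift values within each group, which may or may not be what you want.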

What you are looking for is not lead but lag.

Compare lead and lag:
library(dplyr)

set.seed(14)

df <- data.frame(
  var = c("A","B","C","D","E","F","G","H","I","J","K","L"),
  val.test = rnorm(12,4,5)
) %>% as_tibble()
df$var <- as.factor(df$var)
df %>% mutate(val.test.lag = lag(val.test, default = last(val.test)),
              val.test.lead = lead(val.test, default = first(val.test)))

Output

 # A tibble: 12 × 4
    var   val.test val.test.lag val.test.lead
    <fct>    <dbl>        <dbl>         <dbl>
  1 A        0.691        5.50         12.6  
  2 B       12.6          0.691        14.6  
  3 C       14.6         12.6          11.5  
  4 D       11.5         14.6           3.82 
  5 E        3.82        11.5          10.2  
  6 F       10.2          3.82          3.68 
  7 G        3.68        10.2           9.34 
  8 H        9.34         3.68          2.12 
  9 I        2.12         9.34          9.22 
 10 J        9.22         2.12          2.09 
 11 K        2.09         9.22          5.50 
 12 L        5.50         2.09          0.691

Created on 2022-04-30 by the reprex package (v2.0.1)
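
For reference, the same wrap-around shift can be sketched in base R without dplyr. The helpers circ_lag and circ_lead below are hypothetical names introduced only for illustration, and the input vector is made up:

```r
# Hypothetical helpers: circular shifts built from head() and tail().
circ_lag <- function(x) {
  # Move the last element to the front, push everything else down one slot.
  c(tail(x, 1), head(x, -1))
}

circ_lead <- function(x) {
  # Move the first element to the back, pull everything else up one slot.
  c(tail(x, -1), head(x, 1))
}

x <- c(10, 20, 30, 40)

lagged <- circ_lag(x)   # 40 10 20 30
led    <- circ_lead(x)  # 20 30 40 10
```

This should give the same result as lag(x, default = last(x)) and lead(x, default = first(x)) on an ungrouped vector.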