How to take the most recent row of data in R?

Suppose I have this data frame:

library(tibble)

data <- tibble(
  period = c("2010END", "2011END", 
             "2010Q1","2010Q2","2010Q3","2010Q4","2010END",
             "2011Q1","2011Q2","2011Q3","2011Q4","2011END",
             "2011END","2012END"),
  date = c('31-12-2010','31-12-2011', '30-04-2010','31-07-2010','30-09-2010','30-11-2010', '31-12-2010',
           '30-04-2011','31-07-2011','30-09-2011','30-11-2011', '31-12-2011', 
           '31-12-2011', '31-12-2012'),
  website = c(
    "google",
    "google",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "youtube",
    "youtube"
  ),
  values = c(1, 2, 1, 2, 3, NA, 5, NA, NA, NA, NA, 10, 20, NA)
)

How would I go about creating a column that identifies the most recent non-NA row for each group of period and website?

So the final output would look like this:

tibble(
  period = c("2010END", "2011END", 
             "2010Q1","2010Q2","2010Q3","2010Q4","2010END",
             "2011Q1","2011Q2","2011Q3","2011Q4","2011END",
             "2011END","2012END"),
  date = c('31-12-2010','31-12-2011', '30-04-2010','31-07-2010','30-09-2010','30-11-2010', '31-12-2010',
           '30-04-2011','31-07-2011','30-09-2011','30-11-2011', '31-12-2011', 
           '31-12-2011', '31-12-2012'),
  website = c(
    "google",
    "google",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "facebook",
    "youtube",
    "youtube"
  ),
  values = c(1, 2, 1, 2, 3, NA, 5, NA, NA, NA, NA, 10, 20, NA), 
  most_recent = c('no','yes', 'no', 'no', 'no', 'no', 'no','yes','yes','yes','yes','yes','yes','no')
)

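I am trying to identify where the first non-NA value occurs within each period and website group when sorted by most recent date, and then flag all rows for that period and website as 'yes' in the most_recent column.

So you have the following:

I know it involves a group by, but I'm not sure where to go from there.

Updated for clarity.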

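The code below selects the most recent non-NA row for each website.
Since this is not exactly your expected result, please clarify your question if necessary, as suggested in the comments.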

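library(data.table); setDT(data)  # convert the data frame to a data.table by reference
data[, most_recent := fifelse(!is.na(values) & date == .SD[!is.na(values), max(date)], 'yes', 'no'), by = website][]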

    period       date  website values most_recent
 1: 2010END 31-12-2010   google      1          no
 2: 2011END 31-12-2011   google      2         yes
 3:  2010Q1 30-04-2010 facebook      1          no
 4:  2010Q2 31-07-2010 facebook      2          no
 5:  2010Q3 30-09-2010 facebook      3          no
 6:  2010Q4 30-11-2010 facebook     NA          no
 7: 2010END 31-12-2010 facebook      5          no
 8:  2011Q1 30-04-2011 facebook     NA          no
 9:  2011Q2 31-07-2011 facebook     NA          no
10:  2011Q3 30-09-2011 facebook     NA          no
11:  2011Q4 30-11-2011 facebook     NA          no
12: 2011END 31-12-2011 facebook     10         yes
13: 2011END 31-12-2011  youtube     20         yes
14: 2012END 31-12-2012  youtube     NA          no
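
The output above flags a single row per website, while the expected output in the question flags several rows for facebook. One possible reading, and it is only an assumption, is that the "period group" means the four-digit year at the start of `period`, so every row of a website that falls in the year of its latest non-NA value should be flagged. A minimal sketch under that assumption follows. The helper columns `date_parsed` and `period_year` are names made up for this illustration, and the dates are parsed first because `dd-mm-yyyy` strings do not sort chronologically when compared as text.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
data <- data.table(
  period  = c("2010END", "2011END",
              "2010Q1", "2010Q2", "2010Q3", "2010Q4", "2010END",
              "2011Q1", "2011Q2", "2011Q3", "2011Q4", "2011END",
              "2011END", "2012END"),
  date    = c("31-12-2010", "31-12-2011",
              "30-04-2010", "31-07-2010", "30-09-2010", "30-11-2010", "31-12-2010",
              "30-04-2011", "31-07-2011", "30-09-2011", "30-11-2011", "31-12-2011",
              "31-12-2011", "31-12-2012"),
  website = rep(c("google", "facebook", "youtube"), times = c(2, 10, 2)),
  values  = c(1, 2, 1, 2, 3, NA, 5, NA, NA, NA, NA, 10, 20, NA)
)

# Parse the dd-mm-yyyy strings so "most recent" is a real date comparison
data[, date_parsed := as.Date(date, format = "%d-%m-%Y")]

# Assumption: the period group is the year prefix of `period`
data[, period_year := substr(period, 1, 4)]

# Per website: find the year of the latest row with a non-NA value,
# then flag every row of that website in that year
data[, most_recent := {
  ok <- !is.na(values)
  latest_year <- if (any(ok)) {
    period_year[ok][which.max(as.numeric(date_parsed[ok]))]
  } else {
    NA_character_  # no non-NA value for this website: nothing is flagged
  }
  fifelse(!is.na(latest_year) & period_year == latest_year, "yes", "no")
}, by = website]

# Drop the helper columns again
data[, c("date_parsed", "period_year") := NULL]
data[]
```

On the sample data this yields the same `most_recent` column as the expected output in the question: "yes" for google's 2011END row, for all five 2011 facebook rows, and for youtube's 2011END row. If the grouping is meant to be something other than the year, only the `period_year` line should need to change.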