一旦多个变量满足条件就放弃观察

Drop observations once condition is met by multiple variables

我有以下数据并使用现有的已回答问题之一来解决我的数据问题,但无法得到我想要的。这是我的数据

有:

       id   Date        Evt_Type   Flag   Amt1   Amt2
      101  2/2/2019      Fee              5
      101  2/3/2019      REF1      Y             5
      101  2/4/2019      Fee              10
      101  2/6/2019      REF2      Y             10
      101  2/7/2019      Fee               4
      101  2/8/2019      REF1
      102  2/2/2019      Fee              25
      102  2/2/2019      REF1      N      25
      103  2/3/2019      Fee              10
      103  2/4/2019      REF1      Y             10
      103  2/5/2019      Fee              10

想要:

      id   Date        Evt_Type   Flag   Amt1   Amt2
     101  2/2/2019      Fee              5
     101  2/3/2019      REF1      Y             5
     101  2/4/2019      Fee              10
     101  2/6/2019      REF2      Y             10
     101  2/7/2019      Fee               4
     101  2/8/2019      REF1
     102  2/2/2019      Fee              25
     102  2/2/2019      REF1      N      25
     103  2/4/2019      REF1      Y             10
     103  2/5/2019      Fee              10

我尝试了以下方法

data want;
    set have;
    by id Date;
    drop count;

    if (first.id or first.date) and FLAG='Y' then
        do;
            retain count;
            count=1;
            output;
            return;
        end;

    if count=1 and ((first.id or first.date) and Flag ne 'Y') then
        do;
            retain count;
            delete;
            return;
        end;
    output;
run;

感谢任何帮助。

谢谢

一种称为 DOW 循环 的技术可以执行以某种方式测量组的计算,然后在第二个循环中将该计算应用于组的成员。

DOW 依赖于循环内的 SET 语句。在这种情况下,计算是“组中的哪一行是具有 flag="Y" 的最后一行。

data want;
  * DOW loop, contains computation;

  _max_n_with_Y = 1e12;

  do _n_ = 1 by 1 until (last.id);
    set have;
    by id;
    if flag='Y' then _max_n_with_Y = _n_;
  end;

  * Follow up loop, applies computation;
  do _n_ = 1 to _n_;
    set have;
    if _n_ <= _max_n_with_Y then OUTPUT;
  end;
  drop _:;
run;

这是一种方法

data have;
input id $ Date : mmddyy10. Evt_Type $ Flag $ Amt1 Amt2;
format Date mmddyy10.;
infile datalines dsd missover;
datalines;
101,2/2/2019,Fee,,5,
101,2/3/2019,REF1,Y,,5
101,2/4/2019,Fee,,10,
101,2/6/2019,REF2,Y,,10
101,2/7/2019,Fee,,4,
102,2/2/2019,Fee,,25,
102,2/2/2019,REF1,N,25,
;

data want;
   do _N_ = 1 by 1 until (last.id);
      set have;
      by id;
      if flag = "Y" then _iorc_ = _N_;
   end;

   do _N_ = 1 to _N_;
      set have;
      if _N_ le _iorc_ then output;
   end;
   _iorc_=1e7;
run;