SAS: Understanding the lag function to retain dates based on work progress

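I have a work progress sheet. So if we have a table with work progress recorded as new, progress, start, end and restart, some rules are:

  1. First, when the work is NEW, the start date is set to '01/01/2013', and the subsequent work_progress rows get the same value.

  2. Second, if the work has ended and is added again, the start date is set to '01/01/2016' (in the table below: Work_id=3). The following work_progress rows must have the same value.

  3. The last case: when the work (work_id: 1, 2) RESTARTs, the start date is set to the beginning of the received work. The later rows must have the same date, '01/05/2017'. Below is the dataset that my logic outputs.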


+---------+---------------+-------------------+------------+------------+
| work_id | work_progress |   received_date   |   start    |    end     |
+---------+---------------+-------------------+------------+------------+
|       1 | NEW           | November 19, 2016 | 01/01/2013 | 31/12/2020 |
|       1 | PROGRESS      | December 25, 2016 | 01/01/2013 | 31/12/2020 |
|       1 | END           | January 1, 2017   | 01/01/2013 | 02/02/2017 |
|       1 | RESTART       | February 5, 2017  | 01/05/2017 | 31/12/2020 |
|       1 | PROGRESS      | March 20, 2017    | 01/01/2013 | 31/12/2020 |
|       2 | NEW           | November 19, 2016 | 01/01/2013 | 31/12/2020 |
|       2 | PROGRESS      | December 25, 2016 | 01/01/2013 | 31/12/2020 |
|       2 | END           | January 1, 2017   | 01/01/2013 | 31/12/2020 |
|       2 | RESTART       | February 5, 2017  | 01/05/2017 | 31/12/2020 |
|       2 | PROGRESS      | March 20, 2017    | 01/01/2013 | 31/12/2020 |
|       3 | NEW           | November 19, 2016 | 01/01/2013 | 31/12/2020 |
|       3 | END           | December 25, 2016 | 01/01/2013 | 02/02/2017 |
|       3 | NEW           | January 1, 2017   | 01/01/2016 | 31/12/2020 |
|       3 | END           | February 5, 2017  | 01/01/2013 | 02/02/2017 |
|       3 | END           | March 20, 2017    | 01/01/2013 | 03/03/2017 |
|       3 | END           | April 21, 2017    | 01/01/2013 | 04/04/2017 |
+---------+---------------+-------------------+------------+------------+  

What my output should actually be:

+---------+---------------+-------------------+------------+------------+
| work_id | work_progress |   received_date   |   start    |    end     |
+---------+---------------+-------------------+------------+------------+
|       1 | NEW           | November 19, 2016 | 01/01/2013 | 31/12/2020 |
|       1 | PROGRESS      | December 25, 2016 | 01/01/2013 | 31/12/2020 |
|       1 | END           | January 1, 2017   | 01/01/2013 | 02/02/2017 |
|       1 | RESTART       | February 5, 2017  | 01/05/2017 | 31/12/2020 |
|       1 | PROGRESS      | March 20, 2017    | 01/05/2017 | 31/12/2020 |
|       2 | NEW           | November 19, 2016 | 01/01/2013 | 31/12/2020 |
|       2 | PROGRESS      | December 25, 2016 | 01/01/2013 | 31/12/2020 |
|       2 | END           | January 1, 2017   | 01/01/2013 | 31/12/2020 |
|       2 | RESTART       | February 5, 2017  | 01/05/2017 | 31/12/2020 |
|       2 | PROGRESS      | March 20, 2017    | 01/05/2017 | 31/12/2020 |
|       3 | NEW           | November 19, 2016 | 01/01/2013 | 31/12/2020 |
|       3 | END           | December 25, 2016 | 01/01/2013 | 02/02/2017 |
|       3 | NEW           | January 1, 2017   | 01/01/2016 | 31/12/2020 |
|       3 | END           | February 5, 2017  | 01/01/2016 | 02/02/2017 |
|       3 | END           | March 20, 2017    | 01/01/2016 | 02/02/2017 |
|       3 | END           | April 21, 2017    | 01/01/2016 | 02/02/2017 |
+---------+---------------+-------------------+------------+------------+    

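Requirements:

  1. Retain the start date from the NEW and RESTART rows.
  2. For work_id=3 and work_progress=END, retain the end date. The March and April rows should both have the February end date (02/02/2017).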

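I need to use lag here to retain the start and end dates. I have implemented half of the logic for this problem, everything except the part that uses lag. Partial SAS code: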

data m_out_ds;
 set m_in_ds;
 by work_id work_received_date;
 /*--------
 Some logic to derive my rules, that gave output, first table above.
  ----------*/
 prevstart = lag(start);
 prevend = lag(end); 
 prev_work_progress = lag(work_progress);

 if work_progress = 'END' and prev_work_progress = 'END' then end = prevend;

/*---This gave 02/02/2017 for march received date only, 
  but we require for april too, obvious the work has ended.----*/

if work_progress = 'PROGRESS' and prev_work_progress ='RESTART' 
  then start = prevstart; 

/*---This however worked---*/

run;

Let me know if any of this is unclear. Thanks.

This seems to match your data, but I am still not sure I understand the rules. First let's turn your text into data.

data have ;
  infile cards dsd dlm='|' truncover ;
  row+1;
  length work_id 8 work_progress $8 received_date start end 8 ;
  informat received_date anydtdte. start end ddmmyy.;
  format received_date  start end yymmdd10.;
  input work_id -- end ;
CARDS;
   1|NEW     | November 19, 2016|01/01/2013|31/12/2020 
   1|PROGRESS| December 25, 2016|01/01/2013|31/12/2020 
   1|END     | January 1, 2017  |01/01/2013|02/02/2017 
   1|RESTART | February 5, 2017 |01/05/2017|31/12/2020 
   1|PROGRESS| March 20, 2017   |01/01/2013|31/12/2020 
   2|NEW     | November 19, 2016|01/01/2013|31/12/2020 
   2|PROGRESS| December 25, 2016|01/01/2013|31/12/2020 
   2|END     | January 1, 2017  |01/01/2013|31/12/2020 
   2|RESTART | February 5, 2017 |01/05/2017|31/12/2020 
   2|PROGRESS| March 20, 2017   |01/01/2013|31/12/2020 
   3|NEW     | November 19, 2016|01/01/2013|31/12/2020 
   3|END     | December 25, 2016|01/01/2013|02/02/2017 
   3|NEW     | January 1, 2017  |01/01/2016|31/12/2020 
   3|END     | February 5, 2017 |01/01/2013|02/02/2017 
   3|END     | March 20, 2017   |01/01/2013|03/03/2017 
   3|END     | April 21, 2017   |01/01/2013|04/04/2017 
;
data want ;
  infile cards dsd dlm='|' truncover ;
  row+1;
  length work_id 8 work_progress $8 received_date start end 8 ;
  informat received_date anydtdte. start end ddmmyy.;
  format received_date  start end yymmdd10.;
  input work_id -- end ;
CARDS;
   1|NEW        |November 19, 2016|01/01/2013|31/12/2020 
   1|PROGRESS   |December 25, 2016|01/01/2013|31/12/2020 
   1|END        |January 1, 2017  |01/01/2013|02/02/2017 
   1|RESTART    |February 5, 2017 |01/05/2017|31/12/2020 
   1|PROGRESS   |March 20, 2017   |01/05/2017|31/12/2020 
   2|NEW        |November 19, 2016|01/01/2013|31/12/2020 
   2|PROGRESS   |December 25, 2016|01/01/2013|31/12/2020 
   2|END        |January 1, 2017  |01/01/2013|31/12/2020 
   2|RESTART    |February 5, 2017 |01/05/2017|31/12/2020 
   2|PROGRESS   |March 20, 2017   |01/05/2017|31/12/2020 
   3|NEW        |November 19, 2016|01/01/2013|31/12/2020 
   3|END        |December 25, 2016|01/01/2013|02/02/2017 
   3|NEW        |January 1, 2017  |01/01/2016|31/12/2020 
   3|END        |February 5, 2017 |01/01/2016|02/02/2017 
   3|END        |March 20, 2017   |01/01/2016|02/02/2017 
   3|END        |April 21, 2017   |01/01/2016|02/02/2017 
;

Now let's try to transform it.

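data try ;
  set have ;
  by work_id;
  /* new_start and new_end carry values across rows within a work_id */
  retain new_start new_end ;
  format new_start new_end yymmdd10.;
  /* reset the carried values at the first row of each work_id */
  if first.work_id then call missing(of new_start new_end);
  /* NEW and RESTART rows define the start date for the rows that follow */
  if work_progress in ('NEW','RESTART') then new_start=start ;
  start=coalesce(new_start,start);
  /* the first END row in a work_id fixes the end date; later END rows reuse it */
  if work_progress in ('END') then do;
    if missing(new_end) then new_end=end ;
    end=coalesce(new_end,end);
  end;
run;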
proc compare data=want compare=try;
  id row;
run;
proc print data=try; run;
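
As for why the LAG attempt in the question fixed only the March row: LAG returns whatever value the variable had the last time the function was called. In that step this is the raw value read from the input, not the value assigned later in the same iteration. A retained variable does not have that limitation. The following is only a minimal sketch of the difference. The dataset names demo and lag_vs_retain and the variables end_lag, end_retain and kept_end are made up for illustration, and the three rows stand in for consecutive END rows of one work_id.

```sas
/* Hypothetical mini-example: three consecutive END rows for one work_id. */
data demo;
  input work_id end :ddmmyy10.;
  format end ddmmyy10.;
  datalines;
3 02/02/2017
3 03/03/2017
3 04/04/2017
;

data lag_vs_retain;
  set demo;
  by work_id;
  format end_lag end_retain ddmmyy10.;

  /* LAG hands back the END value seen on the previous call (the raw input value). */
  prevend = lag(end);
  end_lag = end;
  if not first.work_id then end_lag = prevend;

  /* A retained variable keeps the first END date of the group on every later row. */
  retain kept_end;
  if first.work_id then kept_end = end;
  end_retain = kept_end;

  drop prevend kept_end;
run;

proc print data=lag_vs_retain; run;
```

With these three rows, end_lag comes out as 02/02/2017, 02/02/2017, 03/03/2017, so the third row picks up the unmodified March value, while end_retain is 02/02/2017 on every row. That is the same pattern the try step above relies on.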