Accumulated Monthly Percentile Calculation in SAS PROC SQL

Question

I have a table very similar to the following format:

ID | Month_ID | Param1 | Param2
1  |    1     |   5    |   10
1  |    1     |   6    |   12
1  |    2     |   4    |   9
1  |    2     |   8    |   15
2  |    1     |   3    |   17
2  |    1     |   5    |   12
2  |    2     |   3    |   11
2  |    2     |   6    |   10

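I need to calculate several percentiles (50, 75, 85, 90, 95) of param1 and param2 by ID and month_id, but for each month I also need to include all the data from the previous months (so month_id=2 would calculate the percentiles of param1 and param2 using the data from both month_id=1 and month_id=2). I have tried proc univariate, but I can only figure out how to get the percentiles for each month separately, using the following code: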

proc univariate data=table noprint;
by ID Month_ID NOTSORTED;
var param1 param2;
output out=Pctls pctlpts  = 50 75 85 90 95
                pctlpre  = param1_ param2_
                pctlname = pct50 pct75 pct85 pct90 pct95;

run;

Does anyone know a way to calculate these percentiles over accumulated months? Thanks in advance!

Answer 1

I can't think of a way to do this directly in proc univariate, but I would probably expand and regroup the data as follows:

*dummy data ;
data input ;
  do ID=1 to 2 ;
    do month_id=1 to 12 ;
      parm1=int(ranuni(1)*100) ;
      parm2=int(ranuni(1)*100) ;
      output ;
    end ;
  end ;
run ;


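*write each row once for every cumulative group it belongs to ;
data expand ;
  set input ;
  do group=12 to 1 by -1 ;
    *a row from month m goes into groups m through 12 ;
    if month_id le group then output ;
  end ;
run ;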

This gives you a group variable where group=1 contains only month 1, group=2 contains months 1 and 2, and so on.
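
One possible way to finish from here, sketched under assumptions rather than taken from the answer itself: sort the expanded data by ID and group, then run the proc univariate step from the question with group in place of Month_ID. The output dataset name Pctls_cum is hypothetical, and the variable names follow the dummy data (parm1, parm2) rather than the question's param1 and param2.

```sas
* sort so that BY-group processing sees each ID/group together ;
proc sort data=expand ;
  by ID group ;
run ;

* percentiles over all months up to and including each group ;
proc univariate data=expand noprint ;
  by ID group ;
  var parm1 parm2 ;
  output out=Pctls_cum pctlpts  = 50 75 85 90 95
                       pctlpre  = parm1_ parm2_
                       pctlname = pct50 pct75 pct85 pct90 pct95 ;
run ;
```

Each row of Pctls_cum would then hold one ID and one group, where the group number is the last month included in the cumulative window.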

Answer 2

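One approach is to preprocess the data by creating cumulative versions of the parameters. This code assumes the whole table is sorted in the order shown in your example. It should work like a SQL group by that also accumulates: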

data accum_table;
    set table;
    by ID Month_ID;
    if first.ID then call missing (Accum1, Accum2);
    Accum1+Param1;
    Accum2+Param2;
    if last.Month_ID then output;
run;

Answer 3

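This should work for custom month ranges (though it assumes that each ID has every month between its minimum and maximum):

Dummy data with varying month ranges: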

data input ;
  do ID=1 to 2 ;
    do month_id=3+ID to 20-ID*2 ;
      parm1=int(ranuni(1)*100) ;
      parm2=int(ranuni(1)*100) ;
      output ;
    end ;
  end ;
run ;

Determine the minimum and maximum month for each ID group:

proc sql ;
create table range as
  select *, min(month_id) as minmonth, max(month_id) as maxmonth
  from input
  group by ID 
  order by ID, month_id
;quit ;

Output each month into the appropriate groups:

data output ;
  set range ;
  by ID ;
  do group=month_id to maxmonth ;
    output ;
  end ;
run;
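
As a quick sanity check on the grouping, the sketch below summarizes which months ended up in each group. It is only an illustration: the dataset name check and the variables n_rows, first_month and last_month are hypothetical names introduced here.

```sas
* one summary row per ID/group combination ;
proc means data=output noprint nway ;
  class ID group ;
  var month_id ;
  * row count plus the earliest and latest month in each group ;
  output out=check n=n_rows min=first_month max=last_month ;
run ;
```

If the grouping behaves as intended, first_month should equal that ID's minimum month in every row and last_month should equal group. The percentiles themselves could then be obtained by sorting output by ID and group and running proc univariate with by ID group.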

