SAS: Collapsing and weighted average calculations

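I have a SAS programming problem that I can't solve on my own, and I would appreciate any input.

I want to collapse the data in a dataset by a variable, summarize/average two variables using weights given by a third variable, and then subtract one result from the other:

Sample data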

number   flag     volume   measure1  measure2
1         A         1         2         2        
2         B         2         4         5
3         A         5         8         20
4         B         10        4         1
5         A         9         10        11
6         B         5         2         9
7         A         4         11        23
8         B         3         1         8

Now: I want the volume-weighted averages of measure1 and measure2, and then measure1 - measure2 computed from them, with everything grouped by flag A and B:

Number Flag      Volume       VolWeightMeasure1      VolWeightMeasure2      FinalMeasure
1        A        19        ((1/19)*2)+((5/19)*8)+...     ...            (VolWeightMeasure1-VolWeightMeasure2)
2        B        20        ((2/20)*4)+((10/20)*4)+...    ...            (VolWeightMeasure1-VolWeightMeasure2)

So basically a collapse, but using volume-weighted measures and then subtracting the two. Thank you for any input!

Best

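This can be done in a single data step using two DO loops, each with its own SET statement (commonly called a double DOW loop, short for Do-loop of Whitlock).

The first loop accumulates the total VOLUME for the BY group. The second loop rereads the same group and evaluates the formula using that total. Only one row per group goes to the output.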

/* Sample data from the question */
data have;
  input flag $ volume measure1 measure2;
  datalines;
A 1 2 2
B 2 4 5
A 5 8 20
B 10 4 1
A 9 10 11
B 5 2 9
A 4 11 23
B 3 1 8
;
run;

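/* BY-group processing requires the data to be sorted by flag */
proc sort data = have; by flag; run;

data want;

  /* First pass over the BY group: accumulate the total volume */
  do _n_ = 1 by 1 until (last.flag);
    set have;
    by flag;

    sum_vol = sum(sum_vol, volume);
  end;

  /* Second pass over the same BY group: accumulate the weighted measures */
  do _n_ = 1 by 1 until (last.flag);
    set have;
    by flag;

    VolWeightMeasure1 = sum(VolWeightMeasure1, (volume/sum_vol)*measure1);
    VolWeightMeasure2 = sum(VolWeightMeasure2, (volume/sum_vol)*measure2);
  end;

  FinalMeasure = VolWeightMeasure1 - VolWeightMeasure2;

  /* One row per flag is written by the implicit OUTPUT at the end of the step */
  drop volume measure1 measure2;
  rename sum_vol = Volume;
run;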

The same result can be obtained with PROC SQL by summing the volume-weighted products per flag and dividing by the total volume:

proc sql;
   select flag,
          sum_volume,
          sum1/sum_volume as volweightmeasure1,
          sum2/sum_volume as volweightmeasure2,
          calculated volweightmeasure1 - calculated volweightmeasure2 as finalmeasure
   from (select flag,
                sum(volume) as sum_volume,
                sum(volume*measure1) as sum1,
                sum(volume*measure2) as sum2
         from have
         group by flag);
quit;
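
The query above divides by the total volume once at the end, whereas the formula in the question weights each row by volume/total. The two agree because the total is constant within a group. The identity and a worked check on the sample data are sketched below; the symbols v_i (volume), m_i (a measure) and V_g (group total) are notation introduced here for illustration, and the numbers were computed by hand, so they are worth re-verifying against an actual run.

```latex
% Volume-weighted mean of a measure m within group g
\[
\bar{m}_g \;=\; \sum_{i \in g} \frac{v_i}{V_g}\, m_i
          \;=\; \frac{\sum_{i \in g} v_i\, m_i}{V_g},
\qquad V_g = \sum_{i \in g} v_i
\]

% Flag A: V = 1 + 5 + 9 + 4 = 19
\[
\bar{m}_{1} = \frac{2 + 40 + 90 + 44}{19} = \frac{176}{19} \approx 9.263, \qquad
\bar{m}_{2} = \frac{2 + 100 + 99 + 92}{19} = \frac{293}{19} \approx 15.421, \qquad
\bar{m}_{1} - \bar{m}_{2} = -\frac{117}{19} \approx -6.158
\]

% Flag B: V = 2 + 10 + 5 + 3 = 20
\[
\bar{m}_{1} = \frac{8 + 40 + 10 + 3}{20} = 3.05, \qquad
\bar{m}_{2} = \frac{10 + 10 + 45 + 24}{20} = 4.45, \qquad
\bar{m}_{1} - \bar{m}_{2} = -1.40
\]
```

Any of the approaches in this thread should reproduce these values for the two output rows, up to rounding.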

If you are comfortable with proc summary/means, you can do most of the work with it:

proc summary data=have nway;
  class flag;
  var measure1 measure2;
  wgt volume;
  output out=wantcomp(drop=_:) sumwgt=Volume mean=VolWeightMeasure1 VolWeightMeasure2;
run;

/* Read the PROC SUMMARY output (wantcomp) and add the difference */
data want;
  set wantcomp;
  FinalMeasure = VolWeightMeasure1-VolWeightMeasure2;
run;