SAS MACRO: Create many datasets, modify them, and combine them into one within a single macro, without needing to output multiple datasets
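My initial dataset has 14,000 distinct values of the variable STID, each with 10^5 observations.
I want to run some procedures for each stid, output the modifications into a dataset per STID, and then set all the STIDs together into one big dataset, without needing to output all the temporary STID datasets.
I started writing a macro: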
data HAVE;
input stid $ NumVar1 NumVar2;
datalines;
a 5 45
b 6 2
c 5 3
r 2 5
f 4 4
j 7 3
t 89 2
e 6 1
c 3 8
kl 1 6
h 2 3
f 5 41
vc 58 4
j 5 9
ude 7 3
fc 9 11
h 6 3
kl 3 65
b 1 4
g 4 4
;
run;
/* to save all distinct values of THE VARIABLE stid into macro variables
where &N_VAR - total number of distinct variable values */
proc sql;
select count(distinct stid)
into :N_VAR
from HAVE;
select distinct stid
into :stid1 - :stid%left(&N_VAR)
from HAVE;
quit;
%macro expand_by_stid;
/*STEP 1: create datasets by STID*/
%do i=1 %to &N_VAR.;
data stid&i;
set HAVE;
if stid="&&stid&i";
run;
/*STEP 2: from here data modifications for each STID-data (with procs and data steps, e.g.)*/
data modified_stid&i;
set stid&i;
NumVar1_trans=NumVar1**2;
NumVar2_trans=NumVar1*NumVar2;
run;
%end;
/*STEP 3: from here should be some code lines that set all created datasets together, one under another, and delete them afterwards*/
data total;
set %do n=1 %to &N_VAR.;
modified_stid&n;
%end;
run;
proc datasets library=usclim;
delete <ALL DATA SETS by SPID>;
run;
%mend expand_by_stid;
%expand_by_stid;
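But the last step doesn't work. How can I do it?
You're very close. All you need to do is remove the semicolon inside the macro loop and put it after the %end in step 3. The first semicolon after %end closes the %end macro statement itself; the second one is emitted as text and terminates the set statement: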
data total;
set
%do n=1 %to &N_VAR.;
modified_stid&n
%end;;
run;
This produces the statement you want:
set modified_stid1 modified_stid2 .... ;
rather than what your macro originally generated:
set modified_stid1; modified_stid2; ...;
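To check which statements the macro actually generates, the MPRINT system option writes the resolved SAS code to the log. A minimal sketch, assuming the macro %expand_by_stid from the question has already been compiled in the current session:

```sas
/* Write the SAS statements generated by macro execution to the log */
options mprint;

%expand_by_stid;

/* Switch the option off again to keep later logs short */
options nomprint;
```

In the log, the generated set statement appears on the lines prefixed with MPRINT(EXPAND_BY_STID), which makes a stray semicolon after each dataset name easy to spot.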
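Finally, you can use the stid: name prefix in the delete statement to delete all the temporary datasets. The trailing colon matches every dataset whose name begins with stid; to remove the modified_stid datasets as well, add modified_stid: to the list: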
proc datasets library=usclim;
delete stid: ;
run;
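If the goal is to avoid keeping thousands of intermediate datasets at all, one possible variation is to append each modified piece to the result inside the loop and reuse a single scratch dataset. The following is only a sketch under stated assumptions: the macro name expand_by_stid_append and the scratch dataset _tmp are hypothetical, everything lives in the WORK library, and the macro variables N_VAR and stid1, stid2, ... have already been created by the PROC SQL step from the question.

```sas
%macro expand_by_stid_append;   /* hypothetical name, not from the original post */
  %local i;

  /* Start from a clean slate; NOWARN avoids a message if TOTAL does not exist yet */
  proc datasets library=work nolist nowarn;
    delete total;
  quit;

  %do i=1 %to &N_VAR.;
    /* Steps 1 and 2 combined: subset one STID and apply the modifications */
    data _tmp;                  /* hypothetical scratch dataset, overwritten each iteration */
      set HAVE(where=(stid="&&stid&i"));
      NumVar1_trans = NumVar1**2;
      NumVar2_trans = NumVar1*NumVar2;
    run;

    /* Step 3: stack the piece onto the result; BASE is created on the first call */
    proc append base=total data=_tmp;
    run;
  %end;

  /* Only one scratch dataset is left to remove */
  proc datasets library=work nolist nowarn;
    delete _tmp;
  quit;
%mend expand_by_stid_append;

%expand_by_stid_append;
```

With this layout only _tmp and total exist at any time, so no prefix-based cleanup is needed. Each iteration still reads HAVE once, so with 14,000 STID values the run time may be dominated by those repeated passes.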