Iterate through vector and convert elements to unquoted variable names

The following macro performs an inner join between two tables, keeping one column from each table in addition to the join column:

%macro ij(x=,y=,to=".default",xc=,yc=,by=);
  %if &to   = ".default" %then %let to = &from;
  PROC SQL;
    CREATE TABLE &to AS
    SELECT t1.&xc, t2.&yc, t1.&by
    FROM &x t1 INNER JOIN &y t2
    ON t1.&by = t2.&by;
  RUN;
%mend;

I would like to find a way to use several columns in &xc, &yc, and &by, since I don't think I can use vectors of variables.

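My idea is to pass the parameters as vectors of strings rather than simple variables, for example xc = {"col1" "col2"}, then loop through them and use %let some_var= %sysfunc(dequote(&some_string)); to convert them back to variables.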

Applied to xc only, it would become something like this:

%macro ij(x=,y=,to=".default",xc=,yc=,by=);
  %if &to   = ".default" %then %let to = &from;
  PROC SQL;
    CREATE TABLE &to AS
    SELECT 
    %do i = 1 %to %NCOL(&xc)
      %let xci = %sysfunc(dequote(&xc[1]));
      t1.&xci,
    %end;
    t2.&yc, t1.&by
    FROM &x t1 INNER JOIN &y t2
    ON t1.&by = t2.&by;
  RUN;
%mend;

But this loop fails. How can I make it work?

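Note: this is a simplified example. My end goal is to build join macros that are as concise as possible and that integrate data quality checks.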

Consider simple lists instead of vectors.

Pass your variable list as an unquoted, space-separated list of values. The values are SAS variable names that can be scanned out as tokens.

%macro ij (x=, ...);
  ...
  %local i token;
  %let i = 1;
  %do %while (%length(%scan(&X,&i)));
    %let token = %scan(&X,&i);
    &token., /* emit the token as source code */
    %let i = %eval(&i+1);
  %end;
  ...
%mend;

%ij ( x = one two three, ... )

Be sure to localize all macro variables to prevent unwanted side effects outside the macro.

For consistency, I try to use I/O-related macro parameter names that mimic SAS procs: data=, out=, file=, ...

Some would say named parameters are too verbose!

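If your 'proto-code' expected the xci symbol to be some sort of sequentially numbered variable, it is not. You would have to use %local xc&i; %let xc&i= for assignment and &&xc&i for resolution. Also, your original code references &from, which is never passed to the macro.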
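The numbered-variable pattern mentioned above can be sketched as follows. This is only an illustration: the macro name split_list and its list= parameter are hypothetical, and it assumes the list is space-separated.

```sas
%macro split_list(list=);
  /* Hypothetical helper: split a space-separated list into xc1, xc2, ... */
  %local i n;
  %let n = %sysfunc(countw(&list,%str( )));

  %do i = 1 %to &n;
    %local xc&i;                           /* declare xc1, xc2, ... as local */
    %let xc&i = %scan(&list,&i,%str( ));   /* assign the i-th token          */
  %end;

  /* &&xc&i resolves in two passes: && becomes &, &i becomes the index, */
  /* and the resulting reference (for example &xc1) is resolved next.   */
  %do i = 1 %to &n;
    %put NOTE: xc&i = &&xc&i;
  %end;
%mend split_list;

%split_list(list=col1 col2 col3)
```

In most cases scanning the list directly with %scan, as in the loop above, is simpler than materializing numbered variables.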

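Building things is fun. I would also suggest looking through past conference papers and SAS literature for similar work that may already meet your goals.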

You can start from a space-separated list of column names and avoid looping entirely:

/*Define list of columns*/
%let COLS = A  B C;
%put COLS = &COLS;

/*Add table alias prefix*/
%let REGEX = %sysfunc(prxparse(s/(\S+)/t1.$1/));
%let COLS = %sysfunc(prxchange(&REGEX,-1,&COLS));
%put COLS = &COLS;
%syscall prxfree(REGEX);

/*Condense multiple spaces to a single space*/
%let COLS = %sysfunc(compbl(&COLS));
%put COLS = &COLS;

/*Replace spaces with commas*/
%let COLS = %sysfunc(translate(&COLS,%str(,),%str( )));
%put COLS = &COLS;
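The same steps can be wrapped in a function-style macro so the prefixed list can be emitted directly inside a SELECT clause. This is a sketch under assumptions: the macro name prefix_cols and its two positional parameters are hypothetical, and column names are assumed to be plain space-separated SAS names.

```sas
%macro prefix_cols(cols, alias);
  /* Hypothetical helper: turn "a b c" into "alias.a,alias.b,alias.c" */
  %local rx out;
  %let rx  = %sysfunc(prxparse(s/(\S+)/&alias..$1/));  /* prefix each token  */
  %let out = %sysfunc(prxchange(&rx,-1,&cols));
  %syscall prxfree(rx);                                /* release the regex  */
  %let out = %sysfunc(compbl(&out));                   /* single spaces only */
  %sysfunc(translate(&out,%str(,),%str( )))            /* spaces to commas   */
%mend prefix_cols;

proc sql;
  create table want as
  select %prefix_cols(name age, t1), %prefix_cols(height weight, t2)
  from sashelp.class t1 inner join sashelp.class t2
  on t1.name = t2.name;
quit;
```

Only macro statements appear before the final %sysfunc call, so the macro emits nothing but the comma-separated list.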

Really, it is much easier to write the code with SAS dataset options than to build complex macro logic.

proc sql ;
  create table want2 as
  select *
  from sashelp.class(keep=name age)
  natural inner join sashelp.class(keep=name height weight)
  ;
quit;

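I would recommend learning how to use data step code instead of SQL code. For most normal data manipulation it is clearer and simpler. Suppose you want to combine IN1 and IN2 on the variable ID, keeping variables A and B from IN1 and variables X and Y from IN2.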

data out ;
  merge in1 in2 ;
  by id ;
  keep id a b x y ;
run;
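Note that a plain MERGE keeps every ID found in either input. If inner-join behavior is wanted, one common approach is to add IN= flags and a subsetting IF. The sketch below assumes both datasets need sorting first; the flag names in_left and in_right are illustrative.

```sas
/* MERGE with BY requires both inputs to be sorted by the BY variable. */
proc sort data=in1; by id; run;
proc sort data=in2; by id; run;

data out;
  merge in1(in=in_left  keep=id a b)
        in2(in=in_right keep=id x y);
  by id;
  if in_left and in_right;   /* subsetting IF: keep IDs present in both */
run;
```

When an ID is repeated in both inputs, a data step merge pairs rows in sequence instead of forming every combination the way an SQL join does, so the two approaches agree only for one-to-one and one-to-many matches.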

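Second, I would resist the urge to generate an overly complex web of macro code. It will make the program harder for the next programmer to understand, including yourself two weeks from now. Your particular example does not look like something worth coding as a macro. You are not actually typing any less information, you are just using a few commas where the SQL code would have keywords such as FROM or JOIN.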

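Now to answer your actual question. To pass a list of values to a macro, use a delimited list. Use a space as the delimiter where possible, and above all avoid using commas. A space-delimited list is easier to type, easier to pass into the macro, and easier to use, because it matches the SAS language, as you can see in the data step above. If you really need to generate code for syntax that uses commas, such as SQL, then have the macro code generate them where they are needed.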

%macro ij
(x=    /* First dataset name */
,y=    /* Second dataset name */
,by=   /* BY variable list */
,to=   /* Output dataset name. If empty use data step to generate DATAn work name */
,xc=   /* Variable list from first dataset */
,yc=   /* Variable list from second dataset */
);
%if not %length(&to) %then %do;
* Let SAS generate a name for new dataset ;
  data ; run;
  %let to=&syslast ;
  proc delete data=&to; run;
%end;
%if not %length(&xc) %then %let xc=*;
%if not %length(&yc) %then %let yc=*;
%local i sep ;
proc sql ;
 create table &to as
   select
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
  &sep.T1.%scan(&by,&i)
  %let sep=,;
%end;
%do i=1 %to %sysfunc(countw(&xc)) ;
  &sep.T1.%scan(&xc,&i)
%end;
%do i=1 %to %sysfunc(countw(&yc)) ;
  &sep.T2.%scan(&yc,&i)
%end;
   from &x T1 inner join &y T2 on
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
  &sep T1.%scan(&by,&i)=T2.%scan(&by,&i)
  %let sep=and;
%end;
 ;
quit;
%mend ij  ;

Try it:

options mprint;
%ij(x=sashelp.class,y=sashelp.class,by=name,to=want,xc=age,yc=height weight);

SAS log:

MPRINT(IJ):   proc sql ;
MPRINT(IJ):   create table want as select T1.name ,T1.age ,T2.height ,T2.weight from sashelp.class
T1 inner join sashelp.class T2 on T1.name=T2.name ;
NOTE: Table WORK.WANT created, with 19 rows and 4 columns.

MPRINT(IJ):   quit;

Finally, as @Tom pointed out, SAS dataset options are more convenient, and using them requires no looping over variables.

Here is the macro I came up with:

    *--------------------------------------------------------------------------------------------- ;
    * JOIN                                                                                         ;
    * Performs any join (defaults to inner join).                                                  ;
    * By default left table is overwritten (convenient for successive left joins)                  ;
    * Performs a natural join so columns should be renamed accordingly through 'rename' parameters ;
    *----------------------------------------------------------------------------------------------;

    %macro join
    (data1=   /* left table */
    ,data2=   /* right table */
    ,keep1=   /* columns to keep (default: keep all), don't use with drop */
    ,keep2=
    ,drop1=   /* columns to drop (default: none), don't use with keep */
    ,drop2=
    ,rename1= /* rename pairs, such as 'old1 = new1 old2 = new2' */
    ,rename2=
    ,j=ij     /* join type, either ij lj or rj */
    ,out=     /* created table, by default data1 (left table is overwritten)*/
    );
    %if not %length(&out) %then %let out = &data1;
    %if %length(&keep1)   %then %let keep1 = keep=&keep1;
    %if %length(&keep2)   %then %let keep2 = keep=&keep2;
    %if %length(&drop1)   %then %let drop1 = drop=&drop1;
    %if %length(&drop2)   %then %let drop2 = drop=&drop2;
    %if %length(&rename1) %then %let rename1 = rename=(&rename1);
    %if %length(&rename2) %then %let rename2 = rename=(&rename2);
    %local kdr1 kdr2; /* keep helper variables out of the caller's scope */
    %let kdr1 =;
    %let kdr2 =;
    %if (%length(&keep1) | %length(&drop1) | %length(&rename1)) %then %let kdr1 = (&keep1&drop1 &rename1);
    %if (%length(&keep2) | %length(&drop2) | %length(&rename2)) %then %let kdr2 = (&keep2&drop2 &rename2);
    %if &j=lj        %then %let j = LEFT JOIN;
    %if &j=ij        %then %let j = INNER JOIN;
    %if &j=rj        %then %let j = RIGHT JOIN;
    proc sql;
    create table &out as select *
    from &data1&kdr1 t1 natural &j &data2&kdr2 t2;
    quit;
    %mend;

Reproducible example:

data temp1;
input letter $ number1 $;
datalines;
a 1
a 2
a 3
b 4
c 8
;

data temp2;
input letter $ letter2 $ number2 $;
datalines;
a c 666
b d 0
;

* left join on common columns into new table temp3;
%join(data1=temp1,data2=temp2,j=lj,out=temp3)
* inner join by default, overwriting temp1, after renaming to join on another column;
%join(data1=temp1,data2=temp2,drop2=letter,rename2= letter2=letter)