SAS Drop Condition - Creating a Waterfall for my Sample

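Good afternoon. I would like to drop these observations from my waterfall, but only the first drop condition shows up in the output. I want to drop all observations that have a SID or a PID, which is why I chose such an absurdly wide range.

Please advise. Thanks!

Also, how can I make more room in the drop_condition column of the output? It sometimes truncates my text. Thanks!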

data temp;
set mydata.ames_housing_data;
format drop_condition .;

if (SID in (0:10000000)) then drop_condition = '01: SID';
else if (PID in (0:10000000)) then drop_condition = '02: PID';
else if (Neighborhood) then drop_condition = '03: Neighborhood';
else if (Zoning in ('A', 'C', 'FV', 'I')) then drop_condition = '04: Non-Residential Zoning';
else drop_condition = '05: Sample Population';

run;

proc freq data=temp;
tables drop_condition;
title 'Sample Waterfall';
run; quit;

Log

   1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
    55         
    56         data temp;
   57         set mydata.ames_housing_data;
   NOTE: Data file MYDATA.AMES_HOUSING_DATA.DATA is in a format that is        native to another host, or the file encoding does not match 
   the session encoding. Cross Environment Data Access will be used, which        might require additional CPU resources and might 
   reduce performance.
    58         format drop condition .;
    59         
    60         if (SID in (0:10000000)) then drop_condition = '01: SID';
    61         else if (PID in (0:10000000)) then drop_condition = '02: PID';
    62         else if (Neighborhood) then drop_condition = '03: Neighborhood';
    63         else if (Zoning in ('A', 'C', 'FV', 'I')) then drop_condition = '04: Non Residential Zoning';
    64         else drop_condition = '05: Sample Population';
    65         
    66         run;

    NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
   62:10   
    NOTE: Variable drop is uninitialized.
    NOTE: Variable condition is uninitialized.
    NOTE: There were 2930 observations read from the data set MYDATA.AMES_HOUSING_DATA.
    NOTE: The data set WORK.TEMP has 2930 observations and 85 variables.
    NOTE: DATA statement used (Total process time):
   real time           0.05 seconds
   cpu time            0.06 seconds


    67         
    68         proc freq data=temp;
    69         tables drop_condition;
    70         title 'Sample Waterfall';
    71         run;

    NOTE: There were 2930 observations read from the data set WORK.TEMP.
    NOTE: PROCEDURE FREQ used (Total process time):
   real time           0.06 seconds
   cpu time            0.06 seconds

    71       !      quit;

    72         
    73         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
    85         

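Since you are not using the MISSING option in PROC FREQ, this means that all of your data meets your first condition. IF/ELSE IF conditions are evaluated in order, and evaluation stops at the first condition that is true.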
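The first-match behaviour is easy to see on a small made-up data set. The sketch below is only an illustration: the `toy` data set and its three rows are invented here and are not the Ames data, and the `length` value of 30 is an arbitrary choice that is wide enough for the labels used.

```sas
/* Hypothetical toy data: two rows with a SID, one row with SID missing */
data toy;
    input SID Zoning $;
    datalines;
1 A
2 RL
. FV
;
run;

data toy_flagged;
    set toy;
    length drop_condition $30;  /* wide enough for the longest label */

    /* Rows 1 and 2 match here, so their Zoning is never examined */
    if SID in (0:10000000) then drop_condition = '01: SID';
    /* Only row 3 (missing SID) gets as far as this test */
    else if Zoning in ('A', 'C', 'FV', 'I') then drop_condition = '04: Non-Residential Zoning';
    else drop_condition = '05: Sample Population';
run;

/* MISSING adds a row for missing values of drop_condition, if there are any */
proc freq data=toy_flagged;
    tables drop_condition / missing;
run;
```

Row 1 has Zoning 'A' but is still labelled '01: SID', because the chain never reaches the zoning test for that row. If every row in the real data has a SID inside the range, every row gets the first label in the same way.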

You can use PROC FREQ or PROC MEANS to check the distribution of your variables.

proc means data=MYDATA.AMES_HOUSING_DATA min max;
var SID;
run;

EDIT: I think your third condition will work as is, but to avoid the notes in the log, and for cleaner code, I would suggest using:

else if not missing(Neighborhood) then ...
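On the truncated text in the question: when a character variable has no LENGTH statement, SAS fixes its length from the first assignment it encounters while compiling the DATA step. Here that is '01: SID', which is 7 characters, so every longer label is cut to 7. A minimal sketch of the difference follows. The names `demo`, `flag`, `label_cut` and `label_ok` are made up for illustration, and 30 is an assumed width, not something taken from the original code.

```sas
data demo;
    length label_ok $30;   /* declared before any assignment */
    flag = 2;

    /* No LENGTH statement: the first assignment fixes label_cut at 7 characters */
    if flag = 1 then label_cut = '01: SID';
    else label_cut = '04: Non-Residential Zoning';

    /* Same logic, but label_ok already has room for 30 characters */
    if flag = 1 then label_ok = '01: SID';
    else label_ok = '04: Non-Residential Zoning';

    /* Writes both values to the log: label_cut is truncated, label_ok is not */
    put label_cut= label_ok=;
run;
```

In the question's DATA step, the equivalent change would be to put a statement such as `length drop_condition $30;` before the first IF.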