Why does scan in DS2 handle backslashes differently from old-school SAS?

I use the functions countw (to count words) and scan (to get a word) a lot to analyse full file names. (For those interested, I usually get them with FILENAME docDir PIPE "dir ""&docRoot"" /B/S";)

With traditional SAS, this works the same for UNIX-style and Windows-style paths:

data OLD_SCHOOL;
    format logic withSlash withBack secondSlash secondBack $20.;

    logic = 'OLD_SCHOOL';

    withSlash = 'Delimited/With/Slash';
    wordsSlash = countw(withSlash, '/');
    secondSlash = scan(withSlash, 2, '/');

    withBack = 'Delimited\With\Back';
    wordsBack = countw(withBack, '\');
    secondBack = scan(withBack, 2, '\');

    worksTheSame = wordsSlash eq wordsBack and secondSlash eq secondBack;

    put _all_;
run;

Results in

    withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
    withBack=Delimited\With\Back   secondBack=With  wordsBack=3 
    worksTheSame=1 

With the newer DS2 syntax, scan and countw handle the backslash differently:

proc ds2;
data DS2_SCHOOL / overwrite=yes;
    dcl double wordsSlash wordsBack worksTheSame;
    dcl char(20)logic withSlash withBack secondSlash secondBack;
    method init();
        logic = 'DB2_SCHOOL';

        withSlash = 'Delimited/With/Slash';
        wordsSlash = countw(withSlash, '/');
        secondSlash = scan(withSlash, 2, '/');

        withBack = 'Delimited\With\Back';
        wordsBack = countw(withBack, '\');
        secondBack = scan(withBack, 2, '\');

        worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);       
    end;
enddata;
run;
quit;

data BOTH_SCHOOLS;
    set OLD_SCHOOL DS2_SCHOOL;
run;

Results in

    withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
    withBack=Delimited\With\Back   secondBack=      wordsBack=1 
    worksTheSame=0

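Is there a good reason for this, or should I report it to SAS as a bug?

(There might be a link with the role of the backslash in regular expressions.)

I verified this in 9.3 (which lacks overwrite=yes; annoying, as a side note):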

proc ds2;
data DS2_SCHOOL ;
    dcl double wordsSlash wordsBack worksTheSame;
    dcl char(20)logic withSlash withBack secondSlash secondBack;
    method init();
        logic = 'DB2_SCHOOL';

        withSlash = 'Delimited/With/Slash';
        wordsSlash = countw(withSlash, '/');
        secondSlash = scan(withSlash, 2, '/');

        withBack = 'Delimited\With\Back';
        wordsBack = countw(withBack, '\');
        secondBack = scan(withBack, 2, '\');

        worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);       
    end;
enddata;
run;
quit;

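The backslash does indeed seem to act as an escape character there: even in your original string, you need a pair of backslashes to get a single one.

This is no longer the case as of 9.4 TS1M3, so it is unclear where between 9.3 TS1M2 and 9.4 TS1M3 this was changed and/or fixed. Unfortunately, it is not mentioned in any change log.

Per comments/verification, it seems to have been changed/fixed in 9.4 TS1M2.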
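
As a minimal sketch of the "pair of backslashes" observation above, the DS2 step could be written with doubled backslashes on an affected release. This assumes the release treats the backslash as an escape character inside DS2 string literals, as seemed to be the case in 9.3; the table name DS2_ESCAPED is hypothetical, and overwrite=yes is left out because 9.3 lacks it.

```sas
proc ds2;
data DS2_ESCAPED;   /* hypothetical output table name */
    dcl double wordsBack;
    dcl char(20) withBack secondBack;
    method init();
        /* Assumption: this release reads \\ in a DS2 literal as one backslash */
        withBack = 'Delimited\\With\\Back';

        /* The delimiter literal is doubled for the same reason */
        wordsBack = countw(withBack, '\\');
        secondBack = scan(withBack, 2, '\\');
    end;
enddata;
run;
quit;
```

On releases where the behaviour was changed (9.4 TS1M2 or later, going by the note above), the doubled backslashes would presumably be taken literally, so check which release the code runs on before relying on this.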

Thanks Joe. To further prove you are right: if I specify my strings in an old-school data step:

Data FROM_OLD_SCHOOL;
    delimiter = '/';
    fullName = 'Delimited/With/Slash';
    output;

    delimiter = '\';
    fullName = 'Delimited\With\Back';
    output;
run;

I can use them perfectly well in a DS2 data step:

proc ds2;
data DS2_SCHOOL / overwrite=yes;
    dcl double partsPresent;
    dcl char(20) secondPart;
    method run();
        set FROM_OLD_SCHOOL;

        partsPresent = countw(fullName, delimiter);
        secondPart = scan(fullName, 2, delimiter);
    end;
enddata;
run;
quit;

Results in

Obs partsPresent secondPart delimiter fullName 
1   3            With       /         Delimited/With/Slash 
2   3            With       \         Delimited\With\Back