Dynamically define a SAS hash

So here is the problem.

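I have a dataset, and for each record I want to load a different hash depending on a condition. I don't know the exact structure of each hash I will load at run time, so I want to be able to execute the definedata statement conditionally. Since I don't know the hash structure, I thought of passing the arguments to definedata through a variable, but that does not work. How can I accomplish this? Here is what I have so far: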

/* Hashes have the same key field */
data hash1;
  key = '1';  a = 10; b = 20; output;
  key = '2';  a = 30; b = 40; output;
run;

/* Hash objects can have different data members and types */
data hash2;
  key = '1';  x = 'AAA'; y = 'BBB'; output;
  key = '2';  x = 'CCC'; y = 'DDD'; output;
run;

/* This is the dataset I want to process */
/* hid specifies which hash I should lookup */
/* key contains the key value to use for the lookup */
/* def is the hash data definition piece of the hash. 
   In practice I will use another hash to retrieve this definition
   But for simplicity we can assume that is part of the have dataset itself */

data have;
  hid = '1'; key = '2'; def = "'a', 'b'"; output;
  hid = '2'; key = '1'; def = "'x', 'y'"; output;
run;

/* This is what I want */

data want;
  set have;

  /* Though I don't know the structure of each hash, I can get a list of all hashes at the onset via some macro processing. So this statement is doable */
  if _N_ = 0 then set hash1 hash2;

  /* This part is OK. The hash declaration is able to accept a variable for the dataset name */

  hashname = "hash" || hid;
  declare hash hh(dataset: hashname);
  hh.definekey('key');

  /* The following line is the problematic piece */
  hh.definedata(def);

  hh.definedone();

  rc = hh.find();
  /* Do something with the values */

  /* Finally delete the object so that it can be redefined again on the next record */
  hh.delete();

run;

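The error I get is: ERROR: Undeclared data symbol 'a', 'b' for hash object. I think the problem is that the definedata method parses its arguments one by one and ends up treating the entire string 'a', 'b' as a single variable.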

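If I define the hash as a superset of all possible variables, it complains when I load a dataset that contains only a subset of those variables. In any case, I cannot define the hashes to contain a superset of all variables (that is, I cannot create every hash with a, b, x and y and leave the extraneous elements missing).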

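So my question is: how can I accomplish what I am trying to do here? Is it possible to iterate the way a macro %do loop would, using only data step constructs, to supply each variable one by one? Or is there some other way?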

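Constraints

  1. I cannot rely on macro processing, because I only know which hash I will use at run time.
  2. For memory reasons, I cannot load all the definitions ahead of time.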

Any help would be appreciated.

Your program can work, but I think performance will be poor.

Note that I changed the value of DEF to make it easier to scan.
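
The scanning step can be tried on its own first. This is a minimal sketch, assuming a blank-delimited list like the revised DEF values; the length of 32 for v is an assumption (the maximum length of a SAS variable name), and the step only writes each token to the log:

```sas
data _null_;
   length def v $32;              /* $32 is an assumed width, enough for one variable name */
   def = 'a b';                   /* blank-delimited list, as in the revised DEF */
   do i = 1 to countw(def, ' ');  /* one iteration per word in the list */
      v = scan(def, i, ' ');      /* i-th variable name */
      putlog i= v=;               /* show the token that would be passed to definedata */
   end;
run;
```

Each token produced this way is a single variable name, which is the form definedata needs for each argument.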

data have;
   hid = '1'; key = '2'; def = "a b"; output;
   hid = '2'; key = '1'; def = "x y"; output;
   run;

/* This is what I want */

data want;
   if _N_ = 0 then set hash1 hash2;
   call missing(of _all_);
   set have;
   hashname = "hash" || hid;
   declare hash hh(dataset: hashname);
   hh.definekey('key');
   /* Define the data items one at a time by scanning DEF */
   length v $32;
   do i = 1 by 1;
      v = scan(def,i,' ');
      putlog v= i=;
      if missing(v) then leave;
      *hh.definedata(def);
      hh.definedata(v);
      end;
   hh.definedone();
   *hh.output(dataset: cats('X',hashname));

   rc = hh.find();
   /* Do something with the values */

   /* Finally delete the object so that it can be redefined again on the next record */
   hh.delete();
   run;

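You can store the hash references in a separate hash. This is called a hash of hashes. Load the hash of hashes with references to the individual hashes only once, at the start of the step.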

Example:

data hash1;
  length key $1;
  input key a b;
  datalines;
1 10 20
2 30 40
3 50 60
4 70 80  
run;

data hash2;
  length key $1;
  input key x :$3. y :$3.;
  datalines;
1 AAA BBB
2 CCC DDD
3 EEE FFF
4 GGG HHH
run;

data hashdataspec;
  length hid $1;
  input hid datavars &: $32.;  /* $32 is wide enough for the lists below */
  datalines;
1   a,b
2   x,y
run;

data have;
  do rowid = 1 to 100;
    p = floor (100*ranuni(123));
    q = 100 + ceil(100*ranuni(123));

    length r s $10;  /* long enough for the longest word, "woodchuck" */
    r = scan ("One of these will become the R value", ceil(8*ranuni(123)));
    s = scan ("How much wood would a woodchuck chuck if ...", ceil(9*ranuni(123)));

    length hid key $1;
    hid = substr('12',   ceil(2*ranuni(123)), 1);
    key = substr('1234', ceil(4*ranuni(123)), 1);

    output;
  end; 
run;

data want;
  sentinel0 = ' ';
  if 0 then set hash1-hash2 hashdataspec; * prep pdv for hash host variables;
  sentinel1 = ' ';

  * prep hashes, one time only;
  if _n_ = 1 then do;
    * load hash data specifiers;
    declare hash hds(dataset:'hashdataspec');
    hds.defineKey('hid');
    hds.defineData('hid', 'datavars');
    hds.defineDone();

    * prep hash of hashes;
    declare hash h;      /* dynamic hash that will be added to hoh */
    declare hash hoh();  /* hash of hashes */
    hoh.defineKey ('hid');
    hoh.defineData ('h');
    hoh.defineDone();

    * loop over hashdataspec, loading dynamically created hashes;
    declare hiter hi('hds');
    do while(hi.next() = 0);
      h = _new_ hash(dataset:cats('hash',hid));    * create dynamic hash;
      h.defineKey('key');
      do _n_ = 1 to countw(datavars, ',');
        h.defineData(scan(datavars,_n_,','));      * define data vars, one at a time;
      end;
      h.defineDone();
      hoh.add();  * add the dynamic hash to the hash of hashes;
    end;
  end;

  * clear hash host variables;
  call missing (of sentinel0--sentinel1);

  set have;

  * lookup which hash (hid) to use
  * this will select the appropriate dynamic hash from hoh and update hash variable h;
  hoh.find();

  * lookup data for key in the hids hash;
  h.find();

  drop datavars;
run;
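
One thing to watch for: both find() calls above are made without capturing a return code, and SAS writes an error to the log when a method called that way fails. If a hid or key in have might not exist, one option is to capture the return code and test it. A minimal standalone sketch (the one-item hash here is made up for illustration and is not part of the step above):

```sas
data _null_;
  length key $1 a 8;
  declare hash h();            /* small illustrative hash, filled by hand */
  h.defineKey('key');
  h.defineData('a');
  h.defineDone();

  key = '1'; a = 10;
  rc = h.add();                /* store key '1' with a = 10 */

  key = '9';                   /* a key that was never added */
  rc = h.find();               /* nonzero return code when the key is not found */
  if rc ne 0 then call missing(a);  /* do not carry over the stale value of a */
  putlog key= rc= a=;
run;
```

In the hash-of-hashes step the same pattern would apply to both hoh.find() and h.find(), testing the first return code before attempting the second lookup.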