Dynamically define a SAS hash
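So here is the problem.
I have a dataset, and for each record I want to load a different hash depending on a condition. I don't know the exact structure of each hash I will load until run time, so I want to be able to execute the definedata statement conditionally. Since I don't know the hash structure, I thought of passing the argument to definedata through a variable, but that doesn't work. How can I do this? Here is what I have so far: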
/* Hashes have the same key field */
data hash1;
key = '1'; a = 10; b = 20; output;
key = '2'; a = 30; b = 40; output;
run;
/* Hash objects can have different data members and types */
data hash2;
key = '1'; x = 'AAA'; y = 'BBB'; output;
key = '2'; x = 'CCC'; y = 'DDD'; output;
run;
/* This is the dataset I want to process */
/* hid specifies which hash I should lookup */
/* key contains the key value to use for the lookup */
/* def is the hash data definition piece of the hash.
In practice I will use another hash to retrieve this definition
But for simplicity we can assume that is part of the have dataset itself */
data have;
hid = '1'; key = '2'; def = "'a', 'b'"; output;
hid = '2'; key = '1'; def = "'x', 'y'"; output;
run;
/* This is what I want */
data want;
set have;
/* Though I don't know the structure of each hash, I can get a list of all hashes at the onset via some macro processing. So this statement is doable */
if _N_ = 0 then set hash1 hash2;
/* This part is OK. The hash declaration is able to accept a variable for the dataset name */
hashname = "hash" || hid;
declare hash hh(dataset: hashname);
hh.definekey('key');
/* The following line is the problematic piece */
hh.definedata(def);
hh.definedone();
rc = hh.find();
/* Do something with the values */
/* Finally delete the object so that it can be redefined again on the next record */
hh.delete();
run;
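The error I get is: ERROR: Undeclared data symbol 'a', 'b' for hash object. I think the problem is that the definedata method handles its arguments one variable at a time, so it ends up treating the whole string 'a', 'b' as a single variable name.
If I define the hash as a superset of all possible variables, it complains when I load a dataset that contains only a subset of them. In any case, I cannot define the hashes as a superset of all variables (i.e. I cannot create every hash with a, b, x and y and leave the extraneous elements missing).
So my question is: how can I accomplish what I am trying to do here? Is it possible to do something like a macro %do iteration using only data step constructs, supplying the variables one at a time? Or is there another way?
Constraints
- I cannot rely on macro processing, because I only know which hash I will use at run time.
- For memory reasons, I cannot load all the definitions ahead of time.
Any help would be appreciated.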
Your program can work, but I think performance will be poor.
Note that I changed the value of DEF to make it easier to scan.
data have;
hid = '1'; key = '2'; def = "a b"; output;
hid = '2'; key = '1'; def = "x y"; output;
run;
/* This is what I want */
data want;
if _N_ = 0 then set hash1 hash2;
call missing(of _all_);
set have;
hashname = "hash" || hid;
declare hash hh(dataset: hashname);
hh.definekey('key');
/* The following line is the problematic piece */
length v $32; /* $32 is assumed (the original length was lost); 32 is the maximum SAS variable name length */
do i = 1 by 1;
v = scan(def,i,' ');
putlog v= i=;
if missing(v) then leave;
*hh.definedata(def);
hh.definedata(v);
end;
hh.definedone();
*hh.output(dataset: cats('X',hashname));
rc = hh.find();
/* Do something with the values */
/* Finally delete the object so that it can be redefined again on the next record */
hh.delete();
run;
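For comparison with the loop above, here is a minimal sketch of the static form, assuming the hash1 dataset from the question already exists. It is only meant to illustrate that definedata takes each variable name as its own character argument, which is why a single string holding a quoted, comma-separated list is read as one (undeclared) name.

```sas
data _null_;
  if 0 then set hash1;                 /* put key, a and b into the PDV */
  declare hash hh(dataset: 'hash1');
  hh.definekey('key');
  hh.definedata('a', 'b');             /* two arguments, two data variables */
  hh.definedone();
  key = '2';
  rc = hh.find();                      /* rc = 0 means the key was found */
  put rc= a= b=;
run;
```

With the sample hash1 data, the log should show rc=0 a=30 b=40. Calling definedata once per name, as the loop does, builds the same definition one variable at a time.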
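You can store the hash references in a separate hash. This is known as a hash of hashes. Load the hash of hashes with references to the individual hashes, and do so only once, at the start of the step.
Example: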
data hash1; length key $1; input /* $1 restored; the length was lost in extraction */
key a b; datalines;
1 10 20
2 30 40
3 50 60
4 70 80
run;
data hash2; length key $1; input /* $1 and $3. restored to fit the data; the widths were lost in extraction */
key x $3. y: $3.; datalines;
1 AAA BBB
2 CCC DDD
3 EEE FFF
4 GGG HHH
run;
data hashdataspec; length hid $1; input /* $1 restored; the $30. width is an assumption */
hid datavars&: $30.; datalines;
1 a,b
2 x,y
run;
data have;
do rowid = 1 to 100;
p = floor (100*ranuni(123));
q = 100 + ceil(100*ranuni(123));
length r s $10; /* length assumed; the original value was lost */
r = scan ("One of these will become the R value", ceil(8*ranuni(123)));
s = scan ("How much wood would a woodchuck chuck if ...", ceil(9*ranuni(123)));
length hid key $1; /* $1 restored so that substr() below yields a single character */
hid = substr('12', ceil(2*ranuni(123)));
key = substr('1234', ceil(4*ranuni(123)));
output;
end;
run;
data want;
sentinel0 = ' ';
if 0 then set hash1-hash2 hashdataspec; * prep pdv for hash host variables;
sentinel1 = ' ';
* prep hashes, one time only;
if _n_ = 1 then do;
* load hash data specifiers;
declare hash hds(dataset:'hashdataspec');
hds.defineKey('hid');
hds.defineData('hid', 'datavars');
hds.defineDone();
* prep hash of hashes;
declare hash h; /* dynamic hash that will be added to hoh */
declare hash hoh(); /* hash of hashes */
hoh.defineKey ('hid');
hoh.defineData ('h');
hoh.defineDone();
* loop over hashdataspec, loading dynamically created hashes;
declare hiter hi('hds');
do while(hi.next() = 0);
h = _new_ hash(dataset:cats('hash',hid)); * create dynamic hash;
h.defineKey('key');
do _n_ = 1 to countw(datavars);
h.defineData(scan(datavars,_n_,',')); * define data vars, one at a time;
end;
h.defineDone();
hoh.add(); * add the dynamic hash to the hash of hashes;
end;
end;
* clear hash host variables;
call missing (of sentinel0--sentinel1);
set have;
* lookup which hash (hid) to use
* this will select the appropriate dynamic hash from hoh and update hash variable h;
hoh.find();
* lookup data for key in the hids hash;
h.find();
drop datavars;
run;
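The two find() calls above are made without capturing a return code. When a hash method is called that way and fails, SAS writes an error to the log, so if some hid or key values might have no match it may be worth capturing the return code instead. The following is a small self-contained sketch of that pattern; the lookup dataset and the key value '9' are hypothetical and exist only for this illustration.

```sas
/* Hypothetical one-row lookup table, only for this illustration */
data lookup;
  key = '1'; a = 10; output;
run;

data _null_;
  if 0 then set lookup;              /* put key and a into the PDV */
  declare hash h(dataset: 'lookup');
  h.defineKey('key');
  h.defineData('a');
  h.defineDone();
  do key = '1', '9';                 /* '9' is deliberately absent from lookup */
    call missing(a);                 /* clear the previous match */
    rc = h.find();                   /* capture the return code */
    if rc = 0 then put 'found: ' key= a=;
    else put 'no match: ' key=;
  end;
run;
```

The same idea would apply to both levels in the example: check the return code of hoh.find() before calling h.find(), and check the second return code before using the retrieved values.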