SAS: subset data by group

Question

This may be a silly question, but I'm having a hard time solving it. I have data like this:

animal firstcharacter
mouse  m
dog    d
cat    c
monkey m
donkey d

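I want to split this "original" dataset into several datasets based on the first character.

In this example, I should end up with 3 groups (c, d, m).

Doing this one group at a time is easy: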

data new_c; set original; if firstcharacter = "c" then output; run;
data new_d; set original; if firstcharacter = "d" then output; run;
data new_m; set original; if firstcharacter = "m" then output; run;

The problem is that my real data has hundreds of these groups. Is there a simple way (using a do loop or macro variables) to do this?

Thanks.

Answer 1

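This is easy to do with a hash table. Here is the 'easy' version: it requires the data to be sorted, but it does not need a hash of hashes or any real management.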

data have;
input animal $ firstcharacter $;
datalines;
mouse  m
dog    d
cat    c
monkey m
donkey d
;;;;
run;

proc sort data=have;  
  by firstcharacter;
run;

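data _null_;
  set have;
  by firstcharacter;
  if _n_=1 then do;
    declare hash h;   /* declare the hash variable once; instances are created per group */
  end;

  /* start of a BY group: create a fresh, empty hash object */
  if first.firstcharacter then do;
    h = _new_ hash(multidata:'y');   /* multidata keeps rows that repeat the same animal */
    h.defineKey('animal');
    h.defineData('animal','firstcharacter');
    h.defineDone();
  end;

  /* add the current row to the hash for this group */
  rc = h.add();

  /* end of the BY group: write the hash out as dataset new_<firstcharacter> */
  if last.firstcharacter then do;
    rc = h.output(dataset:cats('new_',firstcharacter));
  end;
run;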

A more sophisticated approach using a hash of hashes also exists (search for it if you want to learn more).
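
For reference, the hash-of-hashes idea could look roughly like the sketch below. It is not part of the original answer, only a minimal illustration under a few assumptions: the input dataset is named have as above, every value of firstcharacter is a valid suffix for a SAS dataset name, and the whole dataset fits in memory. The names hoh and hi are made up for this example. The outer hash stores one inner hash per group, so the input does not have to be sorted first.

```sas
data _null_;
  declare hash h;                        /* handle to the inner hash of the current group */
  declare hash hoh();                    /* outer hash: one entry per firstcharacter value */
  hoh.defineKey('firstcharacter');
  hoh.defineData('firstcharacter', 'h'); /* the inner hash object is stored as data */
  hoh.defineDone();
  declare hiter hi('hoh');               /* iterator over the outer hash */

  /* read every row once; no prior sort is needed */
  do until (eof);
    set have end=eof;
    /* find() loads this group's inner hash into h if the group already exists */
    if hoh.find() ne 0 then do;
      /* first row of a new group: create its inner hash and register it */
      h = _new_ hash(multidata: 'y');
      h.defineKey('animal');
      h.defineData('animal', 'firstcharacter');
      h.defineDone();
      rc = hoh.add();
    end;
    rc = h.add();                        /* store the current row in its group's hash */
  end;

  /* walk the groups and write each inner hash to its own dataset */
  do while (hi.next() = 0);
    rc = h.output(dataset: cats('new_', firstcharacter));
  end;
  stop;
run;
```

With the sample data this should create new_c, new_d and new_m, just like the sorted version. Because the inner hashes are not declared as ordered, the row order inside each output dataset is not guaranteed.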

