Transposing a table while collapsing duplicate observations per BY group

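I have a dataset of diagnosis records in which a patient can have one or more records, even for the same code. I cannot use the variable 'code' for grouping, because it produces an error like: The ID value "code_v58" occurs twice in the same BY group.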

data have;
    input id rand found  code $;
    datalines;            
    1   101      1      001
    2   102      1      v58
    2   103      0      v58  /* second diagnosis record for patient 2 */
    3   104      1      v58
    4   105      1      003
    4   106      1      003  /* second diagnosis record for patient 4 */
    5   107      0      v58
    ;

Desired output:
Obs  id  code_001  code_v58  code_003
1    1   1         .         .
2    2   .         1         .    /* one of patient 2's v58 records has found = 1, so 1 has to be taken */
3    3   .         1         .
4    4   .         .         1
5    5   .         0         .

When I tried the LET option, like [this]:
proc transpose data=temp out=want(drop=_name_) prefix=code_ let;
  by id;
  id code;   * column name becomes <prefix><code>;
  var found;
run;

I get the following output:

Obs  id  code_001  code_v58  code_003
1    1   1         .         .
2    2   .         0         .
3    3   .         1         .
4    4   .         .         1
5    5   .         0         .

I also tried this and modified PROC TRANSPOSE to use both id and count in the BY statement:

proc transpose data=temp out=want(drop=_name_) prefix=code_;
  by id count;
  id code;   * column name becomes <prefix><code>;
  var found;
run;

and got the following output:

Obs  id  count  code_001  code_v58  code_003
1    1   1      1         .         .
2    2   1      .         1         .
3    2   2      .         0         .
4    3   1      .         1         .
5    4   1      .         .         1
6    4   2      .         .         1
7    5   1      .         0         .

May I know how to remove the duplicate patient IDs and set the code to 1 if it is found in any of the records?

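It looks to me like you want something like this: first preprocess the data to get the value of FOUND that you want, then transpose (if you really need to). TABULATE does what you want for FOUND (it takes the max, which is 1 if a 1 is present, 0 if only 0s are present, and missing otherwise), and then TRANSPOSE works just as you had it before.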

proc tabulate data=have out=tab;
class id code;
var found;    
tables id,code*found*max;
run;

proc transpose data=tab out=want prefix=code_;
  by id;
  id code;
  var found_max;
run;
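
If you would rather not use TABULATE, the same preprocessing can probably be done with PROC MEANS. This is only a sketch of that substitution, not part of the approach above: the data set name agg is made up for illustration, and found_max is named so that it matches the variable used in the TRANSPOSE step.

```sas
/* Assumption: PROC MEANS stands in for PROC TABULATE here.        */
/* "agg" is a hypothetical data set name used only in this sketch. */
proc means data=have noprint nway;
  class id code;                    /* one output row per id/code pair   */
  var found;
  output out=agg(drop=_type_ _freq_)
         max=found_max;             /* 1 if any record has found = 1     */
run;

/* With NWAY, agg is ordered by the CLASS variables, so BY id works. */
proc transpose data=agg out=want(drop=_name_) prefix=code_;
  by id;
  id code;
  var found_max;
run;
```

Because CLASS is used rather than BY, have does not need to be sorted beforehand.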

  

You can transpose a view that aggregates by group:

proc sql; 
  create view have_v as
  select id, code, max(found) as found
  from have
  group by id, code
  order by id, code
;
quit;

proc transpose data=have_v out=want prefix=code_;
  by id;
  id code;
  var found;
run;
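
Another possibility, sketched here rather than tested against your real data, relies on the LET behavior seen in the question: with LET, PROC TRANSPOSE keeps the last occurrence of a duplicated ID value within the BY group. If the data is sorted so that the largest FOUND comes last within each id/code pair, LET should pick the value you want. The data set name have_sorted is an assumption for illustration.

```sas
/* Assumption: "have_sorted" is a hypothetical name for the sorted copy. */
/* Ascending sort puts missing, then 0, then 1 within each id/code pair, */
/* so the last record per pair carries the maximum of found.             */
proc sort data=have out=have_sorted;
  by id code found;
run;

/* LET allows duplicate ID values and keeps the last occurrence per BY group. */
proc transpose data=have_sorted out=want(drop=_name_) prefix=code_ let;
  by id;
  id code;
  var found;
run;
```

The log may still mention the duplicate ID values, and the result depends entirely on the sort order, so the aggregate-then-transpose versions are the more explicit choice.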

If you want to replace the missing values (.) with 0, follow up with PROC STDIZE (thanks @Reeza):
proc stdize data=want out=want missing=0 reponly;
var code_:;
run;
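
If PROC STDIZE is not available to you (as far as I know it belongs to SAS/STAT), a plain DATA step with an array should give the same result. This is a minimal sketch; the output name want_zero and the loop variable i are illustrative, and it assumes every code_ column is numeric.

```sas
/* Assumption: "want_zero" is a hypothetical output name. */
data want_zero;
  set want;
  array codes {*} code_:;           /* all variables whose names start with code_ */
  do i = 1 to dim(codes);
    if missing(codes{i}) then codes{i} = 0;
  end;
  drop i;
run;
```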