Determining which diseases cluster together

Question

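How can I determine which diseases cluster together? I have a dataset of patients and their diseases. Each disease is a 0/1 indicator: for example, HOHT = 1 if the patient has the disease and HOHT = 0 if not.

Below is a sample of the data. Without writing a pile of if-then statements, how can I determine which diseases most often occur together? The goal is to create something like a Venn diagram or a dendrogram that shows how the diseases overlap.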

Moya    Hypothyroid    Hyperthyroid    Celiac
1       1              0               0
1       1              0               0
0       0              1               1
0       0              0               0
1       1              0               0
1       0              1               0
1       1              0               0
1       1              0               0
0       0              1               1
0       0              1               1

Answer 1

The simplest approach I can think of is proc corr, which lets you look at the correlation matrix of the disease indicators:

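/* Read the sample data: one row per patient, one 0/1 column per disease */
data diseases;
input Moya Hypothyroid Hyperthyroid Celiac;
cards;
1 1 0 0
1 1 0 0
0 0 1 1
0 0 0 0
1 1 0 0
1 0 1 0
1 1 0 0
1 1 0 0
0 0 1 1
0 0 1 1
;
run;

/* Pairwise correlations between the disease indicators, */
/* printed and also saved to the data set disease_corr   */
proc corr data = diseases out = disease_corr; run;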
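
If raw co-occurrence counts or a dendrogram are closer to what you want than correlations, the following is a rough sketch rather than a definitive recipe. It assumes the diseases data set created above and a SAS/STAT license for proc varclus and proc tree; the data set name disease_tree is made up for illustration. For 0/1 variables, the sums-of-squares-and-crossproducts table from proc corr holds the number of patients with both diseases in each off-diagonal cell and the number with each disease on the diagonal.

```sas
/* Co-occurrence counts: for 0/1 indicators the uncorrected SSCP  */
/* table counts patients who have both diseases in each pair.     */
proc corr data = diseases sscp nocorr nosimple;
  var Moya Hypothyroid Hyperthyroid Celiac;
run;

/* Group the disease variables by their correlation structure and */
/* save the tree. disease_tree is a hypothetical data set name.   */
proc varclus data = diseases outtree = disease_tree;
  var Moya Hypothyroid Hyperthyroid Celiac;
run;

/* Draw the saved tree as a dendrogram of the disease variables */
proc tree data = disease_tree horizontal;
run;
```

With only ten patients the grouping will be unstable, so treat the tree as exploratory.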

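There are various other options, but I'm not sure this question is really a good fit for this site, since it is very broad and is more about statistics than programming. If you run into a more specific problem, feel free to ask another question.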

