对具有相同 ID 的观察结果进行分组
Grouping observations with same ID
我参加了一个项目,其中数据集将患者信息和疾病观察分开,但两个观察具有相同的 ID。示例:
ID:12345 患者年龄:23,患者体重:55,患者身高:180
ID:12345 疾病进展:A,疾病类型:abc,疾病风险:50
每位患者都会如此。
现在我想运行一些关于这个和那个的统计数据,据我所知,我必须在一次观察中整合这些信息,所以我们在一次观察中有病人特征和他们的疾病特征.最好的方法是什么?请记住,这两个观察结果具有互斥变量,因此可以以某种方式简单地将它们“分组”。
* Example generated by -dataex-. To install: ssc install dataex
clear
input str32 record_id byte(treatment gender pcos aneurysmal_finding sz_anu_a_basilaris sz_anu_a_basilaris_2 sz_anu_a_basilaris_4)
"8a36ac06e58a541430cd8b31df3aeef2" . 1 0 2 . . .
"8a36ac06e58a541430cd8b31df3aeef2" . . . . . . .
"2afc1f12901992a1f973cde814615349" . 1 0 2 . . .
"2afc1f12901992a1f973cde814615349" . . . . . . .
"1e00442745c25082a64197b96065f755" . 1 0 2 . . .
"1e00442745c25082a64197b96065f755" . . . . . . .
"c90aef04e29f38fc3e21b919d5106ce8" . 1 0 2 . . .
"c90aef04e29f38fc3e21b919d5106ce8" . . . . . . .
"7cac71f3d31c7e9ec26e6a885ad554ab" . 2 0 2 . . .
"7cac71f3d31c7e9ec26e6a885ad554ab" . . . . . . .
"53c1f08aff25ace9afc46aca3263e7ca" . 1 0 2 . . .
"53c1f08aff25ace9afc46aca3263e7ca" . . . . . . .
"cdbf4328e0724f30950e437bc6bbe262" . 2 0 2 . . .
"cdbf4328e0724f30950e437bc6bbe262" . . . . . . .
"50d722dca92aee72c39c846066850a22" 1 2 0 2 . . .
"50d722dca92aee72c39c846066850a22" . . . . . . .
"ffe78f8927a81a5521f098aa077a755f" . 1 0 1 . . .
"ffe78f8927a81a5521f098aa077a755f" . . . . . . .
"aa2309be5c9b76012462fce3f43a8249" . 1 0 1 . . .
"aa2309be5c9b76012462fce3f43a8249" . . . . . . .
"4917b3d300e195b895e573474be6ccb6" . 1 0 2 . . .
"4917b3d300e195b895e573474be6ccb6" . . . . . . .
"b88557884343831060297ff4b67aeb36" . 1 . 2 . . .
"b88557884343831060297ff4b67aeb36" . . . . . . .
"ebe8ab86719aa71b68d7f0df3e451ce5" . . . 2 . . .
"8dd5267472002c796ce621984f9024ed" . . . . . . 3
"0b3e110c9765e14a5c41fadcc3cfc300" . . . . . . .
"8f58545ef8d37f290d26881743137a72" 1 2 0 2 . . .
"8f58545ef8d37f290d26881743137a72" . . . . . . .
"dcb6a27d1d4f5f1228860a76fa29e5ba" . 1 0 2 . . .
"dcb6a27d1d4f5f1228860a76fa29e5ba" . . . . . . .
"baedce78f2e736fe4d54dbdbe0460694" . 2 0 2 . . .
"baedce78f2e736fe4d54dbdbe0460694" . . . . . . .
"bb1db3b0eca9652cff3c76060b06d60b" 1 2 0 2 . . .
"bb1db3b0eca9652cff3c76060b06d60b" . . . . . . .
"6741bd218feba9de630dfe409a4e50ee" 1 2 0 2 . . .
"6741bd218feba9de630dfe409a4e50ee" . . . . . . .
"1e1425d670466e1a2c6c752d9227df17" . 2 0 2 . . .
"1e1425d670466e1a2c6c752d9227df17" . . . . . . .
"4c6672a06addc8e01842d2741be1857d" . 1 0 2 . . .
"4c6672a06addc8e01842d2741be1857d" . . . . . . .
"f1be80fbb7e4e1f5582780e25bfc8a2c" . 2 0 2 . . .
"f1be80fbb7e4e1f5582780e25bfc8a2c" . . . . . . .
"9991ec586e5f510e161fcad93fb1d79f" . 1 0 2 . . .
"9991ec586e5f510e161fcad93fb1d79f" . . . . . . .
"5c1eb56eccf9cf67ae6065f82b6eb6ce" . 1 0 2 . . .
"5c1eb56eccf9cf67ae6065f82b6eb6ce" . . . . . . .
"f9d10d2eb1951fa2ebc8b0509bb25593" . 1 0 2 . . .
"f9d10d2eb1951fa2ebc8b0509bb25593" . . . . . . .
"fbdf663512805caffe7a99d14fc9561f" . 2 0 2 . . .
"fbdf663512805caffe7a99d14fc9561f" . . . . . . .
"3b55aebe1b4b22e0c77168acc4b775dd" . 1 0 2 . . .
"3b55aebe1b4b22e0c77168acc4b775dd" . . . . . . .
"5f28194ddef4f9d057db2e4fcb7b5cf0" . 1 0 2 . . .
"5f28194ddef4f9d057db2e4fcb7b5cf0" . . . . . . .
"0b8d8253a8415275dbc2619e039985bb" . 1 0 2 5 . .
"0b8d8253a8415275dbc2619e039985bb" . . . . . . .
"4fb152c8524750b65b6717282cceb805" . 1 0 2 . . .
"4fb152c8524750b65b6717282cceb805" . . . . . . .
"ff5136e64c2110c355debca6acb74a13" . 1 0 2 . . .
"ff5136e64c2110c355debca6acb74a13" . . . . . . .
"29534fe6f18b75090b9d18f853ed7ec1" . 1 0 2 5 5 .
"29534fe6f18b75090b9d18f853ed7ec1" . . . . . . .
"8c334d2225db0661b25cf5f2c65fbcb9" . 1 0 2 . . .
"8c334d2225db0661b25cf5f2c65fbcb9" . . . . . . .
"68cf4b9f2db11cb9cf44fd0e03c53f16" . 2 . 2 . . .
"68cf4b9f2db11cb9cf44fd0e03c53f16" . . . . . . .
"6a44e65e7b1f33a3603acf2532bb40f9" . 1 0 2 . . .
"6a44e65e7b1f33a3603acf2532bb40f9" . . . . . . .
"2ed013748bf88df47c39d83bd48d8040" . 1 0 2 . . .
"2ed013748bf88df47c39d83bd48d8040" . . . . . . .
"c2f32f5b61b97d658f7b042b49b8da96" . 1 0 2 . . .
"c2f32f5b61b97d658f7b042b49b8da96" . . . . . . .
"58e1e0b5c29dee7d3739ec582d62b84c" . . . . . . .
"58e1e0b5c29dee7d3739ec582d62b84c" . . . . . . .
"8635b098d70b200fe8eef5dbf7c1c156" . 2 0 2 . . .
"8635b098d70b200fe8eef5dbf7c1c156" . . . . . . .
"266f1f1517fb50bafca92fff39c259d5" . 1 0 2 . . .
"266f1f1517fb50bafca92fff39c259d5" . . . . . . .
"d3df754a7322c02ed89f1208977a19ae" 1 2 0 2 5 . .
"d3df754a7322c02ed89f1208977a19ae" . . . . . . .
"46598c5d2da10731582d6342944e9337" . 1 0 2 . . .
"46598c5d2da10731582d6342944e9337" . . . . . . .
"8c2c5aa9b02eb1092b34cf38c2b1c83d" . 1 0 2 2 . .
"8c2c5aa9b02eb1092b34cf38c2b1c83d" . . . . . . .
"797c9cf7caf53f514f0154f34895fa80" . 1 0 2 . . .
"797c9cf7caf53f514f0154f34895fa80" . . . . . . .
"9b28a68095c520edcb56bee8aa5737b6" . 1 0 2 . . .
"9b28a68095c520edcb56bee8aa5737b6" . . . . . . .
"09e03748da35e9d799dc5d8ddf1909b5" . 1 0 2 . . .
"09e03748da35e9d799dc5d8ddf1909b5" . . . . . . .
"75d5574d8804d24932e3d0d9cbfa4b11" . 1 0 2 . . .
"75d5574d8804d24932e3d0d9cbfa4b11" . . . . . . .
"b5bda504efd4bd3b3be68513ccbf99ef" . 1 0 2 . . .
"b5bda504efd4bd3b3be68513ccbf99ef" . . . . . . .
"dc289c2a5a31355521dde31c4abd4c83" . 1 0 2 . . .
"dc289c2a5a31355521dde31c4abd4c83" . . . . . . .
"76ce83dbd64f05556e903deb54959d22" . 1 0 2 . . .
"76ce83dbd64f05556e903deb54959d22" . . . . . . .
"830ee6dd656938201f4a712607739768" . 1 0 2 . . .
end
label values treatment treatment_
label def treatment_ 1 "I blodfortyndende behandling", modify
label values gender gender_
label def gender_ 1 "Kvinde", modify
label def gender_ 2 "Mand", modify
label values pcos pcos_
label def pcos_ 0 "Nej", modify
label values aneurysmal_finding aneurysmal_finding_
label def aneurysmal_finding_ 1 "Screening", modify
label def aneurysmal_finding_ 2 "Tilfældig fund", modify
label values sz_anu_a_basilaris sz_anu_a_basilaris_
label def sz_anu_a_basilaris_ 2 "7-12 mm", modify
label def sz_anu_a_basilaris_ 5 "Uoplyst", modify
label values sz_anu_a_basilaris_2 sz_anu_a_basilaris_2_
label def sz_anu_a_basilaris_2_ 5 "Uoplyst", modify
label values sz_anu_a_basilaris_4 sz_anu_a_basilaris_4_
label def sz_anu_a_basilaris_4_ 3 "13-25 mm", modify
基本上我的问题可以使用(假设此语法正确)来解决:
gen obs1 = .
replace every variableValue in obs1 == variableValue in 12345
然后将其迭代 1000 次观察..
假设您已经加载了两个数据集。
第一个数据集称为数据集 1。
第二个数据集称为数据集 2。
而您的 ID 属性称为 id
然后你可以在id上调用merge它们。
use dataset1, clear
merge id using dataset2
假设每个标识符最多由 2 个观察值表示,并且每个其他变量都是数字并且由最多 1 个观察值中的非缺失值表示,您可以 collapse
。默认的 collapse
方法应该可以正常工作。
ds record_id, not
collapse `r(varlist)' , by(record_id)
更谨慎的方法是先检查假设!
我参加了一个项目,其中数据集将患者信息和疾病观察分开,但两个观察具有相同的 ID。示例:
ID:12345 患者年龄:23,患者体重:55,患者身高:180
ID:12345 疾病进展:A,疾病类型:abc,疾病风险:50
每位患者都会如此。
现在我想运行一些关于这个和那个的统计数据,据我所知,我必须在一次观察中整合这些信息,所以我们在一次观察中有病人特征和他们的疾病特征.最好的方法是什么?请记住,这两个观察结果具有互斥变量,因此可以以某种方式简单地将它们“分组”。
* Example generated by -dataex-. To install: ssc install dataex
clear
input str32 record_id byte(treatment gender pcos aneurysmal_finding sz_anu_a_basilaris sz_anu_a_basilaris_2 sz_anu_a_basilaris_4)
"8a36ac06e58a541430cd8b31df3aeef2" . 1 0 2 . . .
"8a36ac06e58a541430cd8b31df3aeef2" . . . . . . .
"2afc1f12901992a1f973cde814615349" . 1 0 2 . . .
"2afc1f12901992a1f973cde814615349" . . . . . . .
"1e00442745c25082a64197b96065f755" . 1 0 2 . . .
"1e00442745c25082a64197b96065f755" . . . . . . .
"c90aef04e29f38fc3e21b919d5106ce8" . 1 0 2 . . .
"c90aef04e29f38fc3e21b919d5106ce8" . . . . . . .
"7cac71f3d31c7e9ec26e6a885ad554ab" . 2 0 2 . . .
"7cac71f3d31c7e9ec26e6a885ad554ab" . . . . . . .
"53c1f08aff25ace9afc46aca3263e7ca" . 1 0 2 . . .
"53c1f08aff25ace9afc46aca3263e7ca" . . . . . . .
"cdbf4328e0724f30950e437bc6bbe262" . 2 0 2 . . .
"cdbf4328e0724f30950e437bc6bbe262" . . . . . . .
"50d722dca92aee72c39c846066850a22" 1 2 0 2 . . .
"50d722dca92aee72c39c846066850a22" . . . . . . .
"ffe78f8927a81a5521f098aa077a755f" . 1 0 1 . . .
"ffe78f8927a81a5521f098aa077a755f" . . . . . . .
"aa2309be5c9b76012462fce3f43a8249" . 1 0 1 . . .
"aa2309be5c9b76012462fce3f43a8249" . . . . . . .
"4917b3d300e195b895e573474be6ccb6" . 1 0 2 . . .
"4917b3d300e195b895e573474be6ccb6" . . . . . . .
"b88557884343831060297ff4b67aeb36" . 1 . 2 . . .
"b88557884343831060297ff4b67aeb36" . . . . . . .
"ebe8ab86719aa71b68d7f0df3e451ce5" . . . 2 . . .
"8dd5267472002c796ce621984f9024ed" . . . . . . 3
"0b3e110c9765e14a5c41fadcc3cfc300" . . . . . . .
"8f58545ef8d37f290d26881743137a72" 1 2 0 2 . . .
"8f58545ef8d37f290d26881743137a72" . . . . . . .
"dcb6a27d1d4f5f1228860a76fa29e5ba" . 1 0 2 . . .
"dcb6a27d1d4f5f1228860a76fa29e5ba" . . . . . . .
"baedce78f2e736fe4d54dbdbe0460694" . 2 0 2 . . .
"baedce78f2e736fe4d54dbdbe0460694" . . . . . . .
"bb1db3b0eca9652cff3c76060b06d60b" 1 2 0 2 . . .
"bb1db3b0eca9652cff3c76060b06d60b" . . . . . . .
"6741bd218feba9de630dfe409a4e50ee" 1 2 0 2 . . .
"6741bd218feba9de630dfe409a4e50ee" . . . . . . .
"1e1425d670466e1a2c6c752d9227df17" . 2 0 2 . . .
"1e1425d670466e1a2c6c752d9227df17" . . . . . . .
"4c6672a06addc8e01842d2741be1857d" . 1 0 2 . . .
"4c6672a06addc8e01842d2741be1857d" . . . . . . .
"f1be80fbb7e4e1f5582780e25bfc8a2c" . 2 0 2 . . .
"f1be80fbb7e4e1f5582780e25bfc8a2c" . . . . . . .
"9991ec586e5f510e161fcad93fb1d79f" . 1 0 2 . . .
"9991ec586e5f510e161fcad93fb1d79f" . . . . . . .
"5c1eb56eccf9cf67ae6065f82b6eb6ce" . 1 0 2 . . .
"5c1eb56eccf9cf67ae6065f82b6eb6ce" . . . . . . .
"f9d10d2eb1951fa2ebc8b0509bb25593" . 1 0 2 . . .
"f9d10d2eb1951fa2ebc8b0509bb25593" . . . . . . .
"fbdf663512805caffe7a99d14fc9561f" . 2 0 2 . . .
"fbdf663512805caffe7a99d14fc9561f" . . . . . . .
"3b55aebe1b4b22e0c77168acc4b775dd" . 1 0 2 . . .
"3b55aebe1b4b22e0c77168acc4b775dd" . . . . . . .
"5f28194ddef4f9d057db2e4fcb7b5cf0" . 1 0 2 . . .
"5f28194ddef4f9d057db2e4fcb7b5cf0" . . . . . . .
"0b8d8253a8415275dbc2619e039985bb" . 1 0 2 5 . .
"0b8d8253a8415275dbc2619e039985bb" . . . . . . .
"4fb152c8524750b65b6717282cceb805" . 1 0 2 . . .
"4fb152c8524750b65b6717282cceb805" . . . . . . .
"ff5136e64c2110c355debca6acb74a13" . 1 0 2 . . .
"ff5136e64c2110c355debca6acb74a13" . . . . . . .
"29534fe6f18b75090b9d18f853ed7ec1" . 1 0 2 5 5 .
"29534fe6f18b75090b9d18f853ed7ec1" . . . . . . .
"8c334d2225db0661b25cf5f2c65fbcb9" . 1 0 2 . . .
"8c334d2225db0661b25cf5f2c65fbcb9" . . . . . . .
"68cf4b9f2db11cb9cf44fd0e03c53f16" . 2 . 2 . . .
"68cf4b9f2db11cb9cf44fd0e03c53f16" . . . . . . .
"6a44e65e7b1f33a3603acf2532bb40f9" . 1 0 2 . . .
"6a44e65e7b1f33a3603acf2532bb40f9" . . . . . . .
"2ed013748bf88df47c39d83bd48d8040" . 1 0 2 . . .
"2ed013748bf88df47c39d83bd48d8040" . . . . . . .
"c2f32f5b61b97d658f7b042b49b8da96" . 1 0 2 . . .
"c2f32f5b61b97d658f7b042b49b8da96" . . . . . . .
"58e1e0b5c29dee7d3739ec582d62b84c" . . . . . . .
"58e1e0b5c29dee7d3739ec582d62b84c" . . . . . . .
"8635b098d70b200fe8eef5dbf7c1c156" . 2 0 2 . . .
"8635b098d70b200fe8eef5dbf7c1c156" . . . . . . .
"266f1f1517fb50bafca92fff39c259d5" . 1 0 2 . . .
"266f1f1517fb50bafca92fff39c259d5" . . . . . . .
"d3df754a7322c02ed89f1208977a19ae" 1 2 0 2 5 . .
"d3df754a7322c02ed89f1208977a19ae" . . . . . . .
"46598c5d2da10731582d6342944e9337" . 1 0 2 . . .
"46598c5d2da10731582d6342944e9337" . . . . . . .
"8c2c5aa9b02eb1092b34cf38c2b1c83d" . 1 0 2 2 . .
"8c2c5aa9b02eb1092b34cf38c2b1c83d" . . . . . . .
"797c9cf7caf53f514f0154f34895fa80" . 1 0 2 . . .
"797c9cf7caf53f514f0154f34895fa80" . . . . . . .
"9b28a68095c520edcb56bee8aa5737b6" . 1 0 2 . . .
"9b28a68095c520edcb56bee8aa5737b6" . . . . . . .
"09e03748da35e9d799dc5d8ddf1909b5" . 1 0 2 . . .
"09e03748da35e9d799dc5d8ddf1909b5" . . . . . . .
"75d5574d8804d24932e3d0d9cbfa4b11" . 1 0 2 . . .
"75d5574d8804d24932e3d0d9cbfa4b11" . . . . . . .
"b5bda504efd4bd3b3be68513ccbf99ef" . 1 0 2 . . .
"b5bda504efd4bd3b3be68513ccbf99ef" . . . . . . .
"dc289c2a5a31355521dde31c4abd4c83" . 1 0 2 . . .
"dc289c2a5a31355521dde31c4abd4c83" . . . . . . .
"76ce83dbd64f05556e903deb54959d22" . 1 0 2 . . .
"76ce83dbd64f05556e903deb54959d22" . . . . . . .
"830ee6dd656938201f4a712607739768" . 1 0 2 . . .
end
label values treatment treatment_
label def treatment_ 1 "I blodfortyndende behandling", modify
label values gender gender_
label def gender_ 1 "Kvinde", modify
label def gender_ 2 "Mand", modify
label values pcos pcos_
label def pcos_ 0 "Nej", modify
label values aneurysmal_finding aneurysmal_finding_
label def aneurysmal_finding_ 1 "Screening", modify
label def aneurysmal_finding_ 2 "Tilfældig fund", modify
label values sz_anu_a_basilaris sz_anu_a_basilaris_
label def sz_anu_a_basilaris_ 2 "7-12 mm", modify
label def sz_anu_a_basilaris_ 5 "Uoplyst", modify
label values sz_anu_a_basilaris_2 sz_anu_a_basilaris_2_
label def sz_anu_a_basilaris_2_ 5 "Uoplyst", modify
label values sz_anu_a_basilaris_4 sz_anu_a_basilaris_4_
label def sz_anu_a_basilaris_4_ 3 "13-25 mm", modify
基本上我的问题可以使用(假设此语法正确)来解决:
gen obs1 = .
replace every variableValue in obs1 == variableValue in 12345
然后将其迭代 1000 次观察..
假设您已经加载了两个数据集。 第一个数据集称为数据集 1。 第二个数据集称为数据集 2。 而您的 ID 属性称为 id
然后你可以在id上调用merge它们。
use dataset1, clear
merge id using dataset2
假设每个标识符最多由 2 个观察值表示,并且每个其他变量都是数字并且由最多 1 个观察值中的非缺失值表示,您可以 collapse
。默认的 collapse
方法应该可以正常工作。
ds record_id, not
collapse `r(varlist)' , by(record_id)
更谨慎的方法是先检查假设!