Collapsing data in SPSS long format

Question

I have a dataset that looks like this:

Year ID Sex Age_end_of_year Count Age_group
2008 1  1   0               2     1     
2008 1  1   1               1     1  
2008 1  1   2               6     1 
2008 1  2   0               2     1     
2008 1  2   1               5     1  
2008 1  2   2               6     1 
2008 2  1   0               5     1
2008 2  1   1               4     1
2008 2  1   2               7     1
2008 2  2   0               2     1
2008 2  2   1               3     1
2008 2  2   2               2     1
.
.
.
2016 99 1   45              20    3

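Sex is coded 1 or 2, Age_end_of_year ranges from 0 to 45, and Count can be any value from 0 to 100. Age_group has three categories (0-15, 16-30, and 31-45 years). The dataset contains more than 200,000 observations. I would like to keep the dataset in long format for now, but I want each ID to appear only once per year, like this: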

Year ID Age_group_1_female Age_group_2_female Age_group_3_female Age_group_1_male ...
2008 1  8                  7                  9                  4   
2008 2  14                 3                  8                  2
2008 3  1                  2                  10                 1
2008 4  1                  14                 8                  9
.
.
.
2016 99 4                  2                  4                  9

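In other words, I want to replace the Sex, Age_group, and Count variables with single variables, as in the example above, and drop the Age_end_of_year variable. The new variables should collapse the count data by sex and Age_group. I have played around with Aggregate and tried transposing and restructuring with the wizard, but I can't get it to work. Any help is appreciated!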

Answer 1

The casestovars command will restructure the data as you need, but to improve the resulting variable names I first create a string variable for sex.

string sexT (a10).
if sex=1 sexT="Male".
if sex=2 sexT="Female".

*now for the restructure.
sort cases by year id SexT Age_group.
casestovars /id=year id /index=SexT Age_group /drop=Age_end_of_year sex/sep="_".
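
One caveat: the sample data has several rows per Year, ID, Sex, and Age_group (one for each Age_end_of_year), and casestovars expects each index combination to occur only once within an ID group. A minimal sketch of one way to handle this is to sum Count with aggregate before restructuring. This assumes, as in the syntax above, that Sex=1 means male and Sex=2 means female, which the question does not confirm.

```spss
* Text labels for sex, under the assumed coding 1=Male and 2=Female.
string sexT (a10).
if Sex=1 sexT="Male".
if Sex=2 sexT="Female".

* Sum Count over Age_end_of_year within each Year, ID, sex and age group.
* outfile=* replaces the active dataset, so only the break variables and Count remain.
aggregate
  /outfile=*
  /break=Year ID sexT Age_group
  /Count=sum(Count).

* Restructure to one row per Year and ID.
sort cases by Year ID sexT Age_group.
casestovars /id=Year ID /index=sexT Age_group /separator="_".
```

Because the aggregated file no longer contains Age_end_of_year or Sex, the drop subcommand is not needed in this version. Any sex and age-group combination that does not occur for a given Year and ID would come out as system-missing.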
