SAS: Counting occurrences based on multiple layers within a set time period
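I am trying to count the number of times the same person pays for the same item at the same place four or more times within 30 days of each instance occurring. For example, the input would look like: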
person service place date
A x shop1 01/01/15
A x shop1 01/15/15
A x shop1 01/20/15
B y shop2 03/20/15
B y shop2 04/01/15
C z shop1 05/05/15
The output would look like this:
person service place date count
A x shop1 01/01/15 3
A x shop1 01/15/15 3
A x shop1 01/20/15 3
B y shop2 03/20/15 2
B y shop2 04/01/15 2
C z shop1 05/05/15 1
I have tried something like:
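data work.want;
    /* First pass: count the rows in each PERSON/PLACE group */
    do _n_ = 1 by 1 until (last.PLACE);
        set work.rawdata;
        by PERSON PLACE;
        if first.PLACE then count = 0;
        count + 1;
    end;
    frequency = count;
    /* Second pass: re-read the same group and output each row with the total */
    do _n_ = 1 by 1 until (last.PLACE);
        set work.rawdata;
        by PERSON PLACE;
        output;
    end;
run;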
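This gives a count based on person and place, but it does not take time into account. Any help or advice would be much appreciated! Thanks.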
proc sql;
    create table summary as
    select person, service, place, count(*) as count
    from rawdata
    group by person, service, place
    /* keep only groups with four or more rows */
    having count(*) >= 4;
quit;
Note: this does not check whether the events occurred within 30 days of each other. I don't know the data types in your dataset.
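Since the data types are unknown, here is a minimal sketch of how a date stored as text could be turned into a numeric SAS date, which any 30-day comparison would need. The dataset name rawdata_char and the variable date_txt are hypothetical, chosen only for illustration; if the date is already numeric, this step is unnecessary.

```sas
/* Hypothetical input: the date is stored as text in date_txt */
data rawdata_char;
    input person $ service $ place $ date_txt :$8.;
    datalines;
A x shop1 01/01/15
A x shop1 01/15/15
;
run;

/* Convert the text to a numeric SAS date so that date arithmetic works */
data rawdata;
    set rawdata_char;
    date = input(date_txt, mmddyy8.);
    format date mmddyy8.;
    drop date_txt;
run;
```

After the conversion, the difference between two date values is a number of days.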
This can be done easily with proc sql.
Your data:
data have;
input person $ service $ place $;
datalines;
A x shop1
A x shop1
A x shop1
B y shop2
B y shop2
C z shop1
;
run;
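Then we count the occurrences of place within each group defined by columns 1 and 2 (person and service), and join the counts back to the original table.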
proc sql;
    create table want as
    select a.*, b._count
    from have as a
    inner join
    (
        select person, service, count(place) as _count
        from have
        group by 1, 2
    ) as b
        on  a.person  = b.person
        and a.service = b.service
    ;
quit;
Is there a date field? We would need it, for example, to group the data by month (or by 30 days).
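If a date field is available, one way to bring the 30-day window into the count is a self-join. The sketch below is not taken from the posts above. It assumes date is read as a numeric SAS date, and it counts, for each row, the rows for the same person, service and place whose date lies within 30 days on either side. The symmetric window is an assumption; for a forward-only window, replace the abs() condition with b.date - a.date between 0 and 30.

```sas
/* Sample data from the question, with the date read as a SAS date value */
data have;
    input person $ service $ place $ date :mmddyy8.;
    format date mmddyy8.;
    datalines;
A x shop1 01/01/15
A x shop1 01/15/15
A x shop1 01/20/15
B y shop2 03/20/15
B y shop2 04/01/15
C z shop1 05/05/15
;
run;

proc sql;
    create table want as
    select a.person, a.service, a.place, a.date,
           count(*) as count   /* includes the row itself (difference of 0 days) */
    from have as a
    inner join have as b
        on  a.person  = b.person
        and a.service = b.service
        and a.place   = b.place
        and abs(b.date - a.date) <= 30   /* assumed: 30 days on either side */
    group by a.person, a.service, a.place, a.date
    order by a.person, a.service, a.place, a.date;
quit;
```

On the sample data this gives 3 for each of A's rows, 2 for each of B's rows and 1 for C's row. Adding having count(*) >= 4 after the group by would keep only the rows that reach four or more. Note that rows sharing the same person, service, place and date collapse into a single output row with this grouping.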