Rolling Numbers by interval SAS EG
I have a dataset in SAS:
OBS CAR DATE_TIME
1 HON JAN-01-17 13:00
2 HON JAN-01-17 13:04
3 HON JAN-01-17 13:06
4 HON JAN-01-17 13:15
5 HON JAN-01-17 13:20
6 HON JAN-01-17 13:29
7 TOY JAN-01-17 13:05
8 TOY JAN-01-17 13:10
9 TOY JAN-01-17 13:39
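The data represents event timestamps for each type of car. I am trying to count the total number of events that happen within any 10-minute interval for a particular car. Currently I do this by adding another column holding the datetime plus 10 minutes and then joining the table to itself. Here is the code.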
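PROC SQL; CREATE TABLE WANT AS
SELECT A.OBS, A.CAR, A.DATE_TIME,
       A.DATE_TIME+(10*60) AS ENDTM, /* window end: 10 minutes (600 seconds) after the event */
       COUNT(B.OBS) AS TOTAL
FROM HAVE A LEFT JOIN HAVE B
  ON A.CAR=B.CAR
  /* ENDTM is computed in the SELECT, not stored in HAVE, so repeat the expression here */
  AND B.DATE_TIME BETWEEN A.DATE_TIME AND A.DATE_TIME+(10*60)
GROUP BY A.OBS, A.CAR, A.DATE_TIME;
QUIT;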
This is the output I get:
OBS CAR DATE_TIME TOT
1 HON JAN-01-17 13:00 3
2 HON JAN-01-17 13:04 2
3 HON JAN-01-17 13:06 2
4 HON JAN-01-17 13:15 2
5 HON JAN-01-17 13:20 2
6 HON JAN-01-17 13:29 1
7 TOY JAN-01-17 13:05 2
8 TOY JAN-01-17 13:10 1
9 TOY JAN-01-17 13:39 1
Is there a more efficient way to do it using a DATA step?
Thanks
Jay
Not a data step, but proc timeseries will do it for you. Just convert your date to a datetime and use an interval of minute10.
data have;
input group$ date$ time$ tot;
month = scan(date, 1, '-');
day = scan(date, 2, '-');
year = scan(date, 3, '-');
datetime = input(cats(day, month, year, ':', time), datetime.);
format datetime datetime.;
datalines;
HON JAN-01-17 13:00 3
HON JAN-01-17 13:04 2
HON JAN-01-17 13:06 2
HON JAN-01-17 13:15 2
HON JAN-01-17 13:20 2
HON JAN-01-17 13:29 1
TOY JAN-01-17 13:05 2
TOY JAN-01-17 13:10 1
TOY JAN-01-17 13:39 1
;
run;
proc timeseries data=have out=want;
by group;
id datetime interval=minute10;
var tot / accumulate=total;
run;
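To see what the minute10 interval does to each timestamp, here is a minimal sketch using intnx. The dataset name demo and the variable bin_start are made up for illustration, and the date is hard-coded. Note that minute10 gives fixed clock-aligned bins (13:00, 13:10, ...), so 13:04 and 13:06 land in the 13:00 bin while 13:15 lands in the 13:10 bin, which is not the same thing as a window starting at each event.

```sas
/* Illustrative only: demo and bin_start are assumed names, not from the thread */
data demo;
  input car $ time :time5.;
  /* build a datetime from a fixed date plus the time of day (in seconds) */
  datetime = dhms('01JAN2017'd, 0, 0, time);
  /* start of the fixed 10-minute bin that contains this timestamp */
  bin_start = intnx('minute10', datetime, 0, 'b');
  format datetime bin_start datetime16.;
  drop time;
  datalines;
HON 13:04
HON 13:06
HON 13:15
;
run;

proc print data=demo;
run;
```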
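A data step option is to use a temporary array, store the timestamps you have already seen in it, and then check which elements of that array still fall inside the window. I am doing it here in the opposite direction from what you show above (I am looking at "10 minutes before"), but you could sort the data in reverse and do it in the direction you need (changing the comparison around the intck call accordingly).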
data have;
input @1 OBS 1. @3 CAR $3. @7 DATE_TIME anydtdtm15.; /* column pointers match the single-spaced datalines below */
format date_time datetime17.;
datalines;
1 HON JAN-01-17 13:00
2 HON JAN-01-17 13:04
3 HON JAN-01-17 13:06
4 HON JAN-01-17 13:15
5 HON JAN-01-17 13:20
6 HON JAN-01-17 13:29
7 TOY JAN-01-17 13:05
8 TOY JAN-01-17 13:10
9 TOY JAN-01-17 13:39
;;;;
run;
data want;
set have;
by car date_time;
array prev_times[20] _temporary_;
tot = 1;
do _i = dim(prev_times) to 1 by -1 while (not missing(prev_times[_i]));
if intck('minute',prev_times[_i], date_time) le 10 then do;
tot = tot + 1;
end;
else do;
call missing(prev_times[_i]);
end;
end;
prev_times[_i] = date_time;
call sortn(of prev_times[*]);
output;
if last.car then call missing(of prev_times[*]);
run;
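For the forward-looking direction mentioned above ("10 minutes after"), something along these lines might work. It is only a sketch under a few assumptions: the names have_desc, want_fwd, later_times and _free are made up, the sample data is rebuilt with a hard-coded date so the block runs on its own, at most 20 events are assumed per window, and a plain subtraction in seconds replaces intck so the test matches the BETWEEN in the question. Instead of call sortn it scans the whole array and reuses the first free slot.

```sas
/* Sample data rebuilt with a fixed date so the block runs on its own */
data have;
  input obs car $ time :time5.;
  date_time = dhms('01JAN2017'd, 0, 0, time);
  format date_time datetime17.;
  drop time;
  datalines;
1 HON 13:00
2 HON 13:04
3 HON 13:06
4 HON 13:15
5 HON 13:20
6 HON 13:29
7 TOY 13:05
8 TOY 13:10
9 TOY 13:39
;
run;

/* Newest first within each car, so the array only ever holds later events */
proc sort data=have out=have_desc;
  by car descending date_time;
run;

data want_fwd;
  set have_desc;
  by car descending date_time;
  array later_times[20] _temporary_;   /* assumption: at most 20 events per window */
  if first.car then call missing(of later_times[*]);
  tot = 1;                             /* the event itself */
  _free = .;
  do _i = 1 to dim(later_times);
    if not missing(later_times[_i]) then do;
      /* datetimes are stored in seconds, so 10 minutes = 600 */
      if later_times[_i] - date_time <= 10*60 then tot = tot + 1;
      /* too far ahead for this event, hence also for all earlier ones */
      else call missing(later_times[_i]);
    end;
    /* remember the first empty slot for the current timestamp */
    if missing(later_times[_i]) and missing(_free) then _free = _i;
  end;
  if missing(_free) then put 'WARNING: array full, enlarge later_times. ' obs=;
  else later_times[_free] = date_time;
  drop _i _free;
run;

/* Back to the original order */
proc sort data=want_fwd;
  by obs;
run;
```

On the sample data this should reproduce the TOT column from the question (3, 2, 2, 2, 2, 1, 2, 1, 1), since each row counts itself plus the same car's events in the following 10 minutes.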