SAS PROC SQL - issue with left join on two character variables
DataSet: Test1
Name Type Length Format Informat
RowID Numeric 8 6. 6.
COL2 Character 6 . .
COL3 Numeric 8 NUMERIC12. NUMERIC12.
DATE Numeric 8 11. 11.
TIME Character 8 $CHAR8. $CHAR8.
Amount Numeric 8
DataSet: Test2
Name Type Length Format Informat
RowID Numeric 8 9. 9.
COL2 Character 32 . .
COL3 Date 8 DATETIME27.6 DATETIME27.6
COL4 Character 17 . .
TIME Character 8 $CHAR8. $CHAR8.
COL5 Numeric 8 NUMERIC12. NUMERIC12.
AMOUNT Numeric 8
Sample data
Test1
RowID COL2 COL3 DATE TIME AMOUNT
3330 123456 123 20110523 14.14.50 2.00
3330 334567 123 20110523 19.13.34 2.00
3330 889789 123 20110523 20.01.11 2.00
3330 45678 1643 20110523 06.53.05 6.00
TEST2
RowID COL2 COL3 COL4 TIME COL5 Amount
3330 0010181002233611096xyBC3TLnDkVB7 23MAY2011:19:14:50.000000 20110523 14:14:50 14:14:50 123 2.00
3330 0010181005005029491mnopqrbT2cySA 24MAY2011:00:13:34.000000 20110523 19:13:34 19:13:34 123 2.00
3330 001018100222213220332ghijkl63BR1 23MAY2011:11:53:05.000000 20110523 06:53:05 06:53:05 1643 6.00
3330 00101810021738472682abcdef7vUcte 23MAY2011:13:30:03.000000 20110523 08:30:03 06:53:05 5575 1.00
I want to left join Test1 and Test2 on Test1.COL3=Test2.COL5 and Test1.TIME=Test2.TIME. The desired output looks like this:
Final
RowID COL2 COL3 DATE TIME AMOUNT RowID COL2 COL3 COL4 TIME COL5 Amount
3330 123456 123 20110523 14.14.50 2.00 3330 0010181002233611096xyBC3TLnDkVB7 23MAY2011:19:14:50.000000 20110523 14:14:50 14:14:50 123 2.00
3330 334567 123 20110523 19.13.34 2.00 3330 0010181005005029491mnopqrbT2cySA 24MAY2011:00:13:34.000000 20110523 19:13:34 19:13:34 123 2.00
3330 889789 123 20110523 20.01.11 2.00 . . . . . . .
3330 45678 1643 20110523 06.53.05 6.00 3330 001018100222213220332ghijkl63BR1 23MAY2011:11:53:05.000000 20110523 06:53:05 06:53:05 1643 6.00
I am running the following code in SAS:
proc sql;
create table final as
select * from
(
(select * from Test1) A
left join
(select * from Test2) B
on Test1.COL3=Test2.COL5 and Test1.Time=Test2.TIME
)
quit;
Even though the TIME column has the same length, format, and informat in both datasets, I do not get the desired output.
The result I get is:
Final
RowID COL2 COL3 DATE TIME AMOUNT RowID COL2 COL3 COL4 TIME COL5 Amount
3330 123456 123 20110523 14.14.50 2.00 . . . . . . .
3330 334567 123 20110523 19.13.34 2.00 . . . . . . .
3330 889789 123 20110523 20.01.11 2.00 . . . . . . .
3330 45678 1643 20110523 06.53.05 6.00 . . . . . . .
I do not understand what is wrong.
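14.14.50 is not equal to 14:14:50. Test1.TIME uses periods as separators and Test2.TIME uses colons, so the character comparison never matches.
Fix your format, or use INPUT to convert both values to numeric.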
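The INPUT suggestion above could look roughly like the following. This is only a sketch under a few assumptions: every TIME value is a valid hh.mm.ss or hh:mm:ss string, the TIME8. informat is appropriate for them, and the B_ column names are hypothetical renames added here so that columns present in both tables do not collide in the output.

```sas
proc sql;
  create table final as
  select A.*,
         B.RowID  as B_RowID,   /* hypothetical names, chosen to avoid  */
         B.COL2   as B_COL2,    /* clashes with the Test1 columns       */
         B.COL3   as B_COL3,
         B.COL4,
         B.TIME   as B_TIME,
         B.COL5,
         B.Amount as B_Amount
  from Test1 as A
  left join Test2 as B
    on A.COL3 = B.COL5
   /* TRANSLATE(source, to, from): turn '.' into ':' in Test1.TIME,
      then read both strings as numeric SAS time values */
   and input(translate(A.TIME, ':', '.'), time8.) = input(B.TIME, time8.);
quit;
```

With the sample data, the Test1 row with TIME 20.01.11 has no partner in Test2, so its B_ columns should come back missing, as in the desired output.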
Try using:
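proc sql;
  create table final as
  select *
  from test1 as A
  left join test2 as B
    on A.col3 = B.col5
   /* SAS has no REPLACE function; TRANSLATE(source, to, from)
      turns every ':' into '.' so both sides use the same separator */
   and translate(A.time, '.', ':') = translate(B.time, '.', ':');
quit;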