过程 SQL SAS 编程
PROC SQL SAS PROGRAMMING
我有以下数据集:
Name Address Bank_Account Ph_NO IP_Address Chargeoff
AJ 12 ABC Street 1234 369 12.12.34 0
CK 12 ABC Street 1234 450 12.12.34 1
DN 15 JMP Street 3431 569 13.8.09 1
MO 39 link street 8421 450 05.67.89 1
LN 12 ABC Street 1234 340 14.75.06 1
ST 15 JMP Street 8421 569 13.8.09 0`
使用这个数据集,我想在 SAS 中创建以下视图:
Name CountOFAddr CountBankacct CountofPhone CountOfIP CountCharegeoff
AJ 3 3 1 2 2
CK 3 3 2 2 3
DN 2 1 2 2 1
MO 1 2 2 1 2
LN 3 3 1 1 2
ST 2 2 2 2 2
输出变量表示如下:
-CountOfAddr : For AJ countOFAddr is 3 which means that AJ Shares its address with itself, CK and LN
-CountBankAcct : For MO count of BankAcct is 2 which means that MO Shares its bank account number with itself and ST.Similarly for variables CountofPhone and CountOfIP.
-CountChargeoff: This one is a little tricky it basically implies that AJ is Linked to CK And LN through address...and both CK and LN have been charged off so the countChargeoff for AJ is 2.
对于 CK
,countChargeOff
是 3
,因为它与自身链接,MO
通过银行帐户,LN/AJ
通过街道地址。 .so CK's
网络中的总 chargeoff
是 3
(CO 计数 AJ+CO
计数 CK+CO
计数 MO+CO
计数 LN
)
我目前在一家金融服务公司担任风险分析师,这个问题的代码可能会帮助我们显着减少欺诈账户的资金。
谢谢。
SELECT
Name,
(SELECT Count(Address)
FROM dataset d2
WHERE d1.Address = d2.Address
) CountOFAddr,
(SELECT Count(Bank_Account)
FROM dataset d2
WHERE d1.Bank_Account = d2.Bank_Account
) CountBankacct,
(SELECT Count(Ph_NO)
FROM dataset d2
WHERE d1.Ph_NO = d2.Ph_NO
) CountofPhone,
(SELECT Count(IP_Address)
FROM dataset d2
WHERE d1.IP_Address = d2.IP_Address
) CountOfIP,
(SELECT count(d2.Chargeoff)
FROM dataset d2
WHERE d1.name <> d2.name
and ( d1.Address = d2.Address
or d1.Bank_Account = d2.Bank_Account
or d1.Ph_NO = d2.Ph_NO
or d1.IP_Address = d2.IP_Address
)
) CountCharegeoff
FROM dataset d1
我包括冲销计算。
将所有 d2 <> d1.name
中有任何共同字段的地方。那就数吧。
我有以下数据集:
Name Address Bank_Account Ph_NO IP_Address Chargeoff
AJ 12 ABC Street 1234 369 12.12.34 0
CK 12 ABC Street 1234 450 12.12.34 1
DN 15 JMP Street 3431 569 13.8.09 1
MO 39 link street 8421 450 05.67.89 1
LN 12 ABC Street 1234 340 14.75.06 1
ST 15 JMP Street 8421 569 13.8.09 0`
使用这个数据集,我想在 SAS 中创建以下视图:
Name CountOFAddr CountBankacct CountofPhone CountOfIP CountCharegeoff
AJ 3 3 1 2 2
CK 3 3 2 2 3
DN 2 1 2 2 1
MO 1 2 2 1 2
LN 3 3 1 1 2
ST 2 2 2 2 2
输出变量表示如下:
-CountOfAddr : For AJ countOFAddr is 3 which means that AJ Shares its address with itself, CK and LN
-CountBankAcct : For MO count of BankAcct is 2 which means that MO Shares its bank account number with itself and ST.Similarly for variables CountofPhone and CountOfIP.
-CountChargeoff: This one is a little tricky it basically implies that AJ is Linked to CK And LN through address...and both CK and LN have been charged off so the countChargeoff for AJ is 2.
对于 CK
,countChargeOff
是 3
,因为它与自身链接,MO
通过银行帐户,LN/AJ
通过街道地址。 .so CK's
网络中的总 chargeoff
是 3
(CO 计数 AJ+CO
计数 CK+CO
计数 MO+CO
计数 LN
)
我目前在一家金融服务公司担任风险分析师,这个问题的代码可能会帮助我们显着减少欺诈账户的资金。
谢谢。
SELECT
Name,
(SELECT Count(Address)
FROM dataset d2
WHERE d1.Address = d2.Address
) CountOFAddr,
(SELECT Count(Bank_Account)
FROM dataset d2
WHERE d1.Bank_Account = d2.Bank_Account
) CountBankacct,
(SELECT Count(Ph_NO)
FROM dataset d2
WHERE d1.Ph_NO = d2.Ph_NO
) CountofPhone,
(SELECT Count(IP_Address)
FROM dataset d2
WHERE d1.IP_Address = d2.IP_Address
) CountOfIP,
(SELECT count(d2.Chargeoff)
FROM dataset d2
WHERE d1.name <> d2.name
and ( d1.Address = d2.Address
or d1.Bank_Account = d2.Bank_Account
or d1.Ph_NO = d2.Ph_NO
or d1.IP_Address = d2.IP_Address
)
) CountCharegeoff
FROM dataset d1
我包括冲销计算。
将所有 d2 <> d1.name
中有任何共同字段的地方。那就数吧。