PROC SQL SAS PROGRAMMING

I have the following dataset:

Name  Address         Bank_Account  Ph_NO    IP_Address   Chargeoff
AJ    12 ABC Street     1234        369      12.12.34         0
CK    12 ABC Street     1234        450      12.12.34         1
DN    15 JMP Street     3431        569      13.8.09          1
MO    39 link street    8421        450      05.67.89         1
LN    12 ABC Street     1234        340      14.75.06         1
ST    15 JMP Street     8421        569      13.8.09          0

Using this dataset, I want to create the following view in SAS:

 Name   CountOFAddr CountBankacct CountofPhone CountOfIP CountCharegeoff
  AJ       3             3           1            2             2
  CK       3             3           2            2             3
  DN       2             1           2            2             1
  MO       1             2           2            1             2
  LN       3             3           1            1             2
  ST       2             2           2            2             2

The output variables mean the following:

-CountOfAddr: For AJ, CountOFAddr is 3, which means AJ shares its address with three records: itself, CK and LN.

-CountBankacct: For MO, CountBankacct is 2, which means MO shares its bank account number with two records: itself and ST. CountofPhone and CountOfIP work the same way.

-CountChargeoff: This one is a little tricky. AJ is linked to CK and LN through the address, and both CK and LN have been charged off, so CountChargeoff for AJ is 2.

For CK, CountChargeoff is 3 because CK is linked to itself, to MO through the phone number, and to LN and AJ through the street address. So the total number of charge-offs in CK's network is 3 (charge-offs of AJ + CK + MO + LN = 0 + 1 + 1 + 1).
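
To make these definitions concrete, here is a minimal Python sketch (not SAS) that computes every column directly from the sample rows. The names `FIELDS`, `records`, `linked` and `summarize` are illustrative and not part of the original post. It assumes CountChargeoff is the sum of Chargeoff over every record that shares at least one field with the person, the person's own row included, which is how the CK example reads.

```python
# Fields that can link two records together.
FIELDS = ("Address", "Bank_Account", "Ph_NO", "IP_Address")
COLUMNS = ("Name",) + FIELDS + ("Chargeoff",)

# Sample rows copied from the dataset in the question.
records = [dict(zip(COLUMNS, row)) for row in [
    ("AJ", "12 ABC Street", "1234", "369", "12.12.34", 0),
    ("CK", "12 ABC Street", "1234", "450", "12.12.34", 1),
    ("DN", "15 JMP Street", "3431", "569", "13.8.09", 1),
    ("MO", "39 link street", "8421", "450", "05.67.89", 1),
    ("LN", "12 ABC Street", "1234", "340", "14.75.06", 1),
    ("ST", "15 JMP Street", "8421", "569", "13.8.09", 0),
]]


def linked(a, b):
    """True when two records share at least one linking field.

    A record is always linked to itself under this definition.
    """
    return any(a[f] == b[f] for f in FIELDS)


def summarize(rows):
    """Map each Name to (CountOFAddr, CountBankacct, CountofPhone,
    CountOfIP, CountChargeoff)."""
    out = {}
    for a in rows:
        # One count per field: how many records (self included) share it.
        shared = [sum(1 for b in rows if a[f] == b[f]) for f in FIELDS]
        # Assumed reading: total charge-offs across the linked network.
        chargeoffs = sum(b["Chargeoff"] for b in rows if linked(a, b))
        out[a["Name"]] = (*shared, chargeoffs)
    return out


summary = summarize(records)
for name, row in summary.items():
    print(name, *row)
```

On the six sample rows this reproduces the desired view shown above.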

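I currently work as a risk analyst at a financial services company, and code for this problem could help us significantly reduce the money we lose to fraudulent accounts.

Thanks.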

SQL Fiddle Demo

SELECT 
    Name,
    (SELECT Count(Address)
     FROM dataset d2
     WHERE d1.Address = d2.Address 
    ) CountOFAddr,

    (SELECT Count(Bank_Account)
     FROM dataset d2
     WHERE d1.Bank_Account = d2.Bank_Account
    ) CountBankacct,

    (SELECT Count(Ph_NO)
     FROM dataset d2
     WHERE d1.Ph_NO = d2.Ph_NO
    ) CountofPhone,

    (SELECT Count(IP_Address)
     FROM dataset d2
     WHERE d1.IP_Address = d2.IP_Address
    ) CountOfIP, 

    (SELECT count(d2.Chargeoff) 
     FROM dataset d2
     WHERE  d1.name <> d2.name
       and (   d1.Address = d2.Address
            or d1.Bank_Account = d2.Bank_Account
            or d1.Ph_NO = d2.Ph_NO 
            or d1.IP_Address = d2.IP_Address
           )
    ) CountCharegeoff           
FROM dataset d1

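I included the charge-off calculation.

It takes every row d2 whose name differs from d1.name and that has any field in common with d1, then counts those rows.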
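
One thing worth checking before relying on the last subquery: COUNT(d2.Chargeoff) counts every linked row with a non-null Chargeoff, whether the value is 0 or 1, and the d1.name <> d2.name filter leaves out the person's own charge-off. On the sample data this happens to match the expected column. If the intent is "total charge-offs in the network, self included", as the CK example in the question suggests, a SUM without the name filter may be closer. The sketch below is only a way to compare the two readings outside SAS, using Python's built-in sqlite3 module. `COUNT_QUERY`, `SUM_QUERY` and `run` are hypothetical names, the SUM variant is an assumption about the intent, and SQLite is not PROC SQL, so treat it as a sanity check rather than a drop-in replacement.

```python
import sqlite3

# Sample rows copied from the dataset in the question.
rows = [
    ("AJ", "12 ABC Street", "1234", "369", "12.12.34", 0),
    ("CK", "12 ABC Street", "1234", "450", "12.12.34", 1),
    ("DN", "15 JMP Street", "3431", "569", "13.8.09", 1),
    ("MO", "39 link street", "8421", "450", "05.67.89", 1),
    ("LN", "12 ABC Street", "1234", "340", "14.75.06", 1),
    ("ST", "15 JMP Street", "8421", "569", "13.8.09", 0),
]

# Two rows are linked when any of the four fields match.
LINKED = """(d1.Address = d2.Address OR d1.Bank_Account = d2.Bank_Account
             OR d1.Ph_NO = d2.Ph_NO OR d1.IP_Address = d2.IP_Address)"""

# The subquery from the answer: counts linked rows other than self.
COUNT_QUERY = f"""
SELECT Name,
       (SELECT COUNT(d2.Chargeoff) FROM dataset d2
        WHERE d1.Name <> d2.Name AND {LINKED})
FROM dataset d1"""

# Assumed alternative: sums the 0/1 flag over linked rows, self included.
SUM_QUERY = f"""
SELECT Name,
       (SELECT SUM(d2.Chargeoff) FROM dataset d2 WHERE {LINKED})
FROM dataset d1"""


def run(query, data):
    """Load `data` into an in-memory table and return {Name: value}."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE dataset (Name TEXT, Address TEXT, Bank_Account TEXT,"
        " Ph_NO TEXT, IP_Address TEXT, Chargeoff INTEGER)"
    )
    con.executemany("INSERT INTO dataset VALUES (?, ?, ?, ?, ?, ?)", data)
    result = dict(con.execute(query).fetchall())
    con.close()
    return result


# Same rows, except LN is no longer charged off.
altered = [r[:5] + (0,) if r[0] == "LN" else r for r in rows]

print(run(COUNT_QUERY, rows) == run(SUM_QUERY, rows))  # both readings agree here
print(run(COUNT_QUERY, altered)["AJ"], run(SUM_QUERY, altered)["AJ"])
```

With LN's Chargeoff set to 0, the COUNT version still reports 2 for AJ (two linked rows), while the SUM version reports 1 (only CK is charged off). Which of the two is wanted depends on how CountChargeoff is meant to be defined.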