确定 groups/networks 的客户

Identifying groups/networks of customers

我正在尝试创建由跨交易的客户互动决定的独特客户组。

以下是数据示例:

Transaction # Primary Customer Cosigner WANT: Customer Group
1 1 2 A
2 1 3 A
3 1 4 A
4 1 2 A
5 2 5 A
6 3 6 A
7 2 1 A
8 3 1 A
9 7 8 B
10 9 C

在此示例中,客户 1 直接或间接地与客户 2-6 相关联,因此与客户 1-6 关联的所有交易都属于“A”组。客户 7 和 8 直接相连,将被标记为“B”组。客户 9 没有联系,是“C”组的唯一成员。

如有任何建议,我们将不胜感激!

您的数据可以被认为是来自 SAS Communities answer here on SO 的 graph. So your request is to find the connected subgraphs of that graph. That question has an answer on Whosebug and SAS Communities. But this question is more on topic than that older SO question. So let's post the subnet SAS macro 的边缘,在那里更容易找到。

这个简单的宏使用重复的 PROC SQL 查询来构建连接子图的列表,直到所有原始记录都已分配给子图。

宏设置为让您传入源数据集的名称和保存节点 ID 的两个变量的名称。

因此,首先让我们将您的打印输出转换为实际的 SAS 数据集。

data have;
  input id primary cosign want $;
cards;
1 1 2 A
2 1 3 A
3 1 4 A
4 1 2 A
5 2 5 A
6 3 6 A
7 2 1 A
8 3 1 A
9 7 8 B
10 9 . C
;

现在我们可以调用宏并告诉它 PRIMARY 和 COSIGN 是具有节点 ID 的变量,而 SUBNET 是新变量的名称,用于保存连接的子图的 ID。注意:此版本默认按指示处理图表。

%subnet(in=have,out=want,from=primary,to=cosign,subnet=subnet);

结果:

Obs    id    primary    cosign    want    subnet

  1     1       1          2        A        1
  2     2       1          3        A        1
  3     3       1          4        A        1
  4     4       1          2        A        1
  5     5       2          5        A        1
  6     6       3          6        A        1
  7     7       2          1        A        1
  8     8       3          1        A        1
  9     9       7          8        B        2
 10    10       9          .        C        3

这是 %SUBNET() 宏的代码。

%macro subnet(in=,out=,from=from,to=to,subnet=subnet,directed=1);
/*----------------------------------------------------------------------
SUBNET - Build connected subnets from pairs of nodes.
Input Table :FROM TO pairs of rows
Output Table:input data with &subnet added
Work Tables:
  NODES - List of all nodes in input.
  NEW - List of new nodes to assign to current subnet.

Algorithm:
Pick next unassigned node and grow the subnet by adding all connected
nodes. Repeat until all unassigned nodes are put into a subnet.

To treat the graph as undirected set the DIRECTED parameter to 0.
----------------------------------------------------------------------*/
%local subnetid next getnext ;
%*----------------------------------------------------------------------
Put code to get next unassigned node into a macro variable. This query 
is used in two places in the program.
-----------------------------------------------------------------------;
%let getnext= select node into :next from nodes where subnet=.;
%*----------------------------------------------------------------------
Initialize subnet id counter.
-----------------------------------------------------------------------;
%let subnetid=0;
proc sql noprint;
*----------------------------------------------------------------------;
* Get list of all nodes ;
*----------------------------------------------------------------------;
  create table nodes as
    select . as subnet, &from as node from &in where &from is not null
    union
    select . as subnet, &to as node from &in where &to is not null
  ;
*----------------------------------------------------------------------;
* Get next unassigned node ;
*----------------------------------------------------------------------;
  &getnext;
%do %while (&sqlobs) ;
*----------------------------------------------------------------------;
* Set subnet to next id ;
*----------------------------------------------------------------------;
  %let subnetid=%eval(&subnetid+1);
  update nodes set subnet=&subnetid where node=&next;
  %do %while (&sqlobs) ;
*----------------------------------------------------------------------;
* Get list of connected nodes for this subnet ;
*----------------------------------------------------------------------;
    create table new as
      select distinct a.&to as node
        from &in a, nodes b, nodes c
        where a.&from= b.node
          and a.&to= c.node
          and b.subnet = &subnetid
          and c.subnet = .
    ;
%if "&directed" ne "1" %then %do;
    insert into new 
      select distinct a.&from as node
        from &in a, nodes b, nodes c
        where a.&to= b.node
          and a.&from= c.node
          and b.subnet = &subnetid
          and c.subnet = .
    ;
%end;
*----------------------------------------------------------------------;
* Update subnet for these nodes ;
*----------------------------------------------------------------------;
    update nodes set subnet=&subnetid
      where node in (select node from new )
    ;
  %end;
*----------------------------------------------------------------------;
* Get next unassigned node ;
*----------------------------------------------------------------------;
  &getnext;
%end;
*----------------------------------------------------------------------;
* Create output dataset by adding subnet number. ;
*----------------------------------------------------------------------;
  create table &out as
    select distinct a.*,b.subnet as &subnet
      from &in a , nodes b
      where a.&from = b.node
  ;
quit;
%mend subnet ;