Clickstream: PostgreSQL Crosstab for UTM Path by User_ID

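I have some clickstream data that I want to analyze to determine which paid ad campaigns drive the most conversions.

I have a table in my database with the following content: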

user_id |   sent_at        |   campaign_name    |  last_click_attribution   
  101   | 2018-10-01 13:04 |   Google_Branded   |  Facebook_Focus
  101   | 2018-10-01 13:07 |   Google_Branded   |  Facebook_Focus 
  101   | 2018-10-02 13:09 |   Facebook_Focus   |  Facebook_Focus
  102   | 2018-09-25 13:04 |   Google_Focus     |  Google_Branded
  102   | 2018-09-27 09:24 |   Google_Branded   |  Google_Branded
  102   | 2018-10-01 11:25 |   Google_Branded   |  Google_Branded
  103   | 2018-09-27 13:04 |   Google_Branded   |  Google_Branded
  103   | 2018-09-28 09:15 |   Google_Branded   |  Google_Branded
  103   | 2018-09-29 18:34 |   Google_Branded   |  Google_Branded
  103   | 2018-09-30 21:02 |   Google_Branded   |  Google_Branded

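campaign_name is the campaign associated with the ad a user clicked to reach our site. last_click_attribution is the last ad they clicked before creating a user account.

I would like to write a PostgreSQL query that returns the following: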

user_id |   last_click_attribution |   second_last_ad    |  third_last_ad  |....   
  101   | Facebook_Focus           |   Google_Branded    |  Google_Branded
  102   | Google_Branded           |   Google_Branded    |  Google_Focus
  103   | Google_Branded           |   Google_Branded    |  Google_Branded

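I assume there is a way to do this with crosstab or by joining two views, but I am not sure how.

Thanks for your help!

If you have any other suggestions for valuable analyses of clickstream data, along with example SQL queries to reference, that would also be much appreciated.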

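You can use row_number() in a subquery to number each user's clicks from most recent to oldest, then use conditional aggregation to pivot those numbered rows into columns.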

CREATE TABLE T(
   user_id int,
   sent_at timestamp,
   campaign_name varchar(50)
);


INSERT INTO T VALUES (101, '2018-10-01 13:04','Google_Branded');   
INSERT INTO T VALUES (101, '2018-10-01 13:07','Google_Branded');   
INSERT INTO T VALUES (101, '2018-10-02 13:09','Facebook_Focus');   
INSERT INTO T VALUES (102, '2018-09-25 13:04','Google_Focus');     
INSERT INTO T VALUES (102, '2018-09-27 09:24','Google_Branded');   
INSERT INTO T VALUES (102, '2018-10-01 11:25','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-27 13:04','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-28 09:15','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-29 18:34','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-30 21:02','Google_Branded');   

Query 1:

SELECT  user_id,
        -- rn = 1 is the most recent click, rn = 2 the one before it, and so on
        MAX(CASE WHEN rn = 1 THEN campaign_name END) AS last_click_attribution,
        MAX(CASE WHEN rn = 2 THEN campaign_name END) AS second_last_ad,
        MAX(CASE WHEN rn = 3 THEN campaign_name END) AS third_last_ad,
        MAX(CASE WHEN rn = 4 THEN campaign_name END) AS fourth_last_ad
FROM (
  -- number each user's clicks, newest first
  SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY sent_at DESC) AS rn
  FROM T
) t1
GROUP BY user_id
ORDER BY user_id;

Results:

| user_id | last_click_attribution | second_last_ad |  third_last_ad | fourth_last_ad |
|---------|------------------------|----------------|----------------|----------------|
|     101 |         Facebook_Focus | Google_Branded | Google_Branded |         (null) |
|     102 |         Google_Branded | Google_Branded |   Google_Focus |         (null) |
|     103 |         Google_Branded | Google_Branded | Google_Branded | Google_Branded |
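
If the number of clicks per user is not known in advance, one possible variation is to collect each user's campaigns into an array ordered from newest to oldest and index into it. This is only a sketch, assuming the table T defined above; the alias `path` and the extra `full_path` column are names introduced here for illustration and are not part of the original question.

```sql
-- Sketch: assumes table T (user_id, sent_at, campaign_name) from above.
SELECT user_id,
       path[1] AS last_click_attribution,  -- most recent click
       path[2] AS second_last_ad,
       path[3] AS third_last_ad,
       path    AS full_path                -- whole click path, newest first
FROM (
  -- one row per user, campaigns ordered from newest to oldest
  SELECT user_id,
         array_agg(campaign_name ORDER BY sent_at DESC) AS path
  FROM T
  GROUP BY user_id
) p
ORDER BY user_id;
```

PostgreSQL arrays are 1-based, and a subscript past the end of the array yields NULL, so users with fewer clicks get NULL in the later columns, as in Query 1. The `full_path` column keeps every click, so nothing is lost when a user has more clicks than there are named columns.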