Concatenate string by a condition

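I want to assign a value to a new column based on a condition over the other rows that share the same user_id.

For example, when device changes from desktop to mobile, assign desktop > mobile to all records of that user_id. When there are two or more distinct changes, e.g. from tablet to desktop and then from desktop to mobile, assign tablet > desktop > mobile.

Sample data: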

+---------+-------+---------+
| user_id | step  | device  |
+---------+-------+---------+
| 7bc6de  | step1 | desktop |
| 7bc6de  | step2 | desktop |
| 7bc6de  | step3 | mobile  |
| 7bc6de  | step4 | mobile  |
| 7bc6de  | step5 | desktop |
| 0ee6df  | step1 | tablet  |
| 0ee6df  | step2 | tablet  |
| 0ee6df  | step3 | desktop |
| 0ee6df  | step4 | desktop |
| 0ee6df  | step5 | mobile  |
+---------+-------+---------+

Desired output:

+---------+-------+---------+---------------------------+
| user_id | step  | device  |    device_concatenated    |
+---------+-------+---------+---------------------------+
| 7bc6de  | step1 | desktop | desktop > mobile          |
| 7bc6de  | step2 | desktop | desktop > mobile          |
| 7bc6de  | step3 | mobile  | desktop > mobile          |
| 7bc6de  | step4 | mobile  | desktop > mobile          |
| 7bc6de  | step5 | desktop | desktop > mobile          |
| 0ee6df  | step1 | tablet  | tablet > desktop > mobile |
| 0ee6df  | step2 | tablet  | tablet > desktop > mobile |
| 0ee6df  | step3 | desktop | tablet > desktop > mobile |
| 0ee6df  | step4 | desktop | tablet > desktop > mobile |
| 0ee6df  | step5 | mobile  | tablet > desktop > mobile |
+---------+-------+---------+---------------------------+

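Additional scenario:

In the table, steps can be repeated, i.e. a user can see the same step at different times on different devices. In that case, how can I take the first step for each user and device, so that the expected result changes as follows?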

+---------+-------+---------------------+---------+
| user_id | step  |     created_at      | device  |
+---------+-------+---------------------+---------+
| user1   | step1 | 2021-03-16 14:03:16 | mobile  |
| user1   | step2 | 2021-03-16 14:04:07 | mobile  |
| user1   | step2 | 2021-03-16 14:03:47 | desktop |
| user1   | step3 | 2021-03-16 14:03:55 | mobile  |
| user1   | step3 | 2021-03-16 14:04:00 | mobile  |
| user1   | step1 | 2021-03-16 14:04:02 | desktop |
| user1   | step2 | 2021-03-16 14:03:16 | mobile  |
| user1   | step3 | 2021-03-16 14:04:07 | mobile  |
| user1   | step4 | 2021-03-16 14:04:08 | desktop |
| user1   | step4 | 2021-03-16 14:04:09 | tablet  |
+---------+-------+---------------------+---------+

Expected result:

+---------+-------+---------------------+---------+---------------------+
| user_id | step  |     created_at      | device  | device_concatenated |
+---------+-------+---------------------+---------+---------------------+
| user1   | step1 | 2021-03-16 14:03:16 | mobile  | mobile > desktop    |
| user1   | step2 | 2021-03-16 14:03:16 | mobile  | mobile > desktop    |
| user1   | step3 | 2021-03-16 14:03:55 | mobile  | mobile > desktop    |
| user1   | step4 | 2021-03-16 14:04:08 | desktop | mobile > desktop    |
+---------+-------+---------------------+---------+---------------------+

https://www.db-fiddle.com/f/ooSmXAxqVHNxqD8sJ6wZfr/0

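-- For each user and device, find the earliest step at which the device appears.
-- min(step) compares text, so this relies on step names that sort correctly.
WITH first_seen_per_user_and_device AS (
  SELECT user_id, device, min(step) AS first_seen_step
  FROM input_data
  GROUP BY user_id, device
),
-- Join each user's devices with ' > ', ordered by first appearance.
user_to_devices AS (
  SELECT user_id,
         array_to_string(array_agg(device ORDER BY first_seen_step), ' > ') AS device_concatenated
  FROM first_seen_per_user_and_device
  GROUP BY user_id
)
-- Attach the concatenated string to every row of that user.
SELECT input_data.*, device_concatenated
FROM input_data
JOIN user_to_devices
  ON user_to_devices.user_id = input_data.user_id;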
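
To make the logic of the query easier to check by hand, here is a minimal Python sketch that mirrors its three stages (first appearance per user and device, ordered concatenation, join back onto every row). It is not part of the answer itself: the function name device_paths and the variable names are illustrative assumptions. Like min(step) on a text column, it compares step names as strings, so it assumes names that sort correctly (for instance, "step10" would sort before "step2").

```python
from collections import defaultdict

# Sample rows from the question: (user_id, step, device)
rows = [
    ("7bc6de", "step1", "desktop"),
    ("7bc6de", "step2", "desktop"),
    ("7bc6de", "step3", "mobile"),
    ("7bc6de", "step4", "mobile"),
    ("7bc6de", "step5", "desktop"),
    ("0ee6df", "step1", "tablet"),
    ("0ee6df", "step2", "tablet"),
    ("0ee6df", "step3", "desktop"),
    ("0ee6df", "step4", "desktop"),
    ("0ee6df", "step5", "mobile"),
]


def device_paths(rows):
    """Map each user_id to its devices joined in order of first appearance."""
    # Stage 1: smallest step per (user_id, device), like MIN(step) ... GROUP BY.
    first_seen = {}
    for user_id, step, device in rows:
        key = (user_id, device)
        if key not in first_seen or step < first_seen[key]:
            first_seen[key] = step

    # Stage 2: collect (first_seen_step, device) pairs per user and
    # join the devices in first-seen order.
    per_user = defaultdict(list)
    for (user_id, device), step in first_seen.items():
        per_user[user_id].append((step, device))
    return {
        user_id: " > ".join(device for _, device in sorted(pairs))
        for user_id, pairs in per_user.items()
    }


paths = device_paths(rows)

# Stage 3: attach the concatenated string to every input row (the final JOIN).
result = [(user_id, step, device, paths[user_id]) for user_id, step, device in rows]

for row in result:
    print(row)
```

A device that reappears later (desktop at step5 for user 7bc6de) is not repeated, because only its first appearance is kept.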

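If the same user and step can be seen on multiple devices, you need to add an extra WITH clause that selects only the row you want (for example, the earliest one), using SELECT DISTINCT ON: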

https://www.db-fiddle.com/f/w9ZRvpQ7KXgdVKCTDAb43o/0

-- Keep one row per (user_id, step): the earliest by created_at.
WITH input_data AS (
  SELECT DISTINCT ON (user_id, step) user_id, step, created_at, device
  FROM input_data_with_created_at
  ORDER BY user_id, step, created_at
),
(...) -- Rest of the CTEs, same as before but with timestamp included.
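
As a cross-check of the deduplication step, the following Python sketch applies the same idea to the sample data from the additional scenario. It keeps the earliest row per (user_id, step) and then builds the device string from the remaining rows. The elided CTEs are not reproduced here: ordering devices by their earliest created_at is an assumption about what "with timestamp included" means, and the function names earliest_per_step and device_paths are hypothetical.

```python
# Sample rows from the question: (user_id, step, created_at, device)
rows = [
    ("user1", "step1", "2021-03-16 14:03:16", "mobile"),
    ("user1", "step2", "2021-03-16 14:04:07", "mobile"),
    ("user1", "step2", "2021-03-16 14:03:47", "desktop"),
    ("user1", "step3", "2021-03-16 14:03:55", "mobile"),
    ("user1", "step3", "2021-03-16 14:04:00", "mobile"),
    ("user1", "step1", "2021-03-16 14:04:02", "desktop"),
    ("user1", "step2", "2021-03-16 14:03:16", "mobile"),
    ("user1", "step3", "2021-03-16 14:04:07", "mobile"),
    ("user1", "step4", "2021-03-16 14:04:08", "desktop"),
    ("user1", "step4", "2021-03-16 14:04:09", "tablet"),
]


def earliest_per_step(rows):
    """Keep the earliest row per (user_id, step), like DISTINCT ON ... ORDER BY."""
    earliest = {}
    for row in rows:
        user_id, step, created_at, _device = row
        key = (user_id, step)
        # ISO-formatted timestamps compare correctly as plain strings.
        if key not in earliest or created_at < earliest[key][2]:
            earliest[key] = row
    return sorted(earliest.values())


def device_paths(rows):
    """Join each user's devices, ordered by earliest created_at (an assumption)."""
    first_seen = {}
    for user_id, _step, created_at, device in rows:
        key = (user_id, device)
        if key not in first_seen or created_at < first_seen[key]:
            first_seen[key] = created_at
    per_user = {}
    for (user_id, device), created_at in first_seen.items():
        per_user.setdefault(user_id, []).append((created_at, device))
    return {
        user_id: " > ".join(device for _, device in sorted(pairs))
        for user_id, pairs in per_user.items()
    }


deduped = earliest_per_step(rows)
paths = device_paths(deduped)
result = [row + (paths[row[0]],) for row in deduped]

for row in result:
    print(row)
```

On this data the tablet row is dropped during deduplication, because the desktop row for step4 is earlier, which is why tablet does not appear in the concatenated string.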