Concatenate strings by a condition
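I want to assign a value to a new column based on a condition over the other rows that share the same user_id.
For example, when device changes from desktop to mobile, assign desktop > mobile to all records of that user_id.
When there are two or more distinct changes, e.g. from tablet to desktop and then from desktop to mobile, the value should be tablet > desktop > mobile.
Sample data: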
+---------+-------+---------+
| user_id | step  | device  |
+---------+-------+---------+
| 7bc6de  | step1 | desktop |
| 7bc6de  | step2 | desktop |
| 7bc6de  | step3 | mobile  |
| 7bc6de  | step4 | mobile  |
| 7bc6de  | step5 | desktop |
| 0ee6df  | step1 | tablet  |
| 0ee6df  | step2 | tablet  |
| 0ee6df  | step3 | desktop |
| 0ee6df  | step4 | desktop |
| 0ee6df  | step5 | mobile  |
+---------+-------+---------+
Desired output:
+---------+-------+---------+---------------------------+
| user_id | step  | device  | device_concatenated       |
+---------+-------+---------+---------------------------+
| 7bc6de  | step1 | desktop | desktop > mobile          |
| 7bc6de  | step2 | desktop | desktop > mobile          |
| 7bc6de  | step3 | mobile  | desktop > mobile          |
| 7bc6de  | step4 | mobile  | desktop > mobile          |
| 7bc6de  | step5 | desktop | desktop > mobile          |
| 0ee6df  | step1 | tablet  | tablet > desktop > mobile |
| 0ee6df  | step2 | tablet  | tablet > desktop > mobile |
| 0ee6df  | step3 | desktop | tablet > desktop > mobile |
| 0ee6df  | step4 | desktop | tablet > desktop > mobile |
| 0ee6df  | step5 | mobile  | tablet > desktop > mobile |
+---------+-------+---------+---------------------------+
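Additional scenario:
The table can contain duplicate steps, i.e. a user can see the same step at different times on different devices. In that case, how can I keep only the first occurrence of each step per user, so that the expected result changes as shown below?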
+---------+-------+---------------------+---------+
| user_id | step  | created_at          | device  |
+---------+-------+---------------------+---------+
| user1   | step1 | 2021-03-16 14:03:16 | mobile  |
| user1   | step2 | 2021-03-16 14:04:07 | mobile  |
| user1   | step2 | 2021-03-16 14:03:47 | desktop |
| user1   | step3 | 2021-03-16 14:03:55 | mobile  |
| user1   | step3 | 2021-03-16 14:04:00 | mobile  |
| user1   | step1 | 2021-03-16 14:04:02 | desktop |
| user1   | step2 | 2021-03-16 14:03:16 | mobile  |
| user1   | step3 | 2021-03-16 14:04:07 | mobile  |
| user1   | step4 | 2021-03-16 14:04:08 | desktop |
| user1   | step4 | 2021-03-16 14:04:09 | tablet  |
+---------+-------+---------------------+---------+
Expected result:
+---------+-------+---------------------+---------+---------------------+
| user_id | step  | created_at          | device  | device_concatenated |
+---------+-------+---------------------+---------+---------------------+
| user1   | step1 | 2021-03-16 14:03:16 | mobile  | mobile > desktop    |
| user1   | step2 | 2021-03-16 14:03:16 | mobile  | mobile > desktop    |
| user1   | step3 | 2021-03-16 14:03:55 | mobile  | mobile > desktop    |
| user1   | step4 | 2021-03-16 14:04:08 | desktop | mobile > desktop    |
+---------+-------+---------------------+---------+---------------------+
https://www.db-fiddle.com/f/ooSmXAxqVHNxqD8sJ6wZfr/0
WITH first_seen_per_user_and_device AS (
    -- Earliest step at which each user used each device.
    SELECT user_id, device, min(step) AS first_seen_step
    FROM input_data
    GROUP BY user_id, device
),
user_to_devices AS (
    -- One row per user: distinct devices joined in first-seen order.
    SELECT user_id,
           array_to_string(array_agg(device ORDER BY first_seen_step), ' > ') AS device_concatenated
    FROM first_seen_per_user_and_device
    GROUP BY user_id
)
SELECT input_data.*, device_concatenated
FROM input_data
JOIN user_to_devices
    ON user_to_devices.user_id = input_data.user_id;
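To check the logic of the query above outside the database, the same two steps (first-seen step per user and device, then an ordered join) can be sketched in Python. This is only an illustration and not part of the original answer: the function name `concat_devices` and the in-memory row tuples are assumptions, and, like `min(step)` in the SQL, it compares step labels as plain strings.

```python
from collections import defaultdict


def concat_devices(rows):
    """Return {user_id: 'device > device ...'} ordered by each device's first step.

    rows: iterable of (user_id, step, device) tuples, a hypothetical
    in-memory stand-in for the input_data table.
    """
    # Mirrors first_seen_per_user_and_device: min(step) per (user_id, device).
    first_seen = {}
    for user_id, step, device in rows:
        key = (user_id, device)
        if key not in first_seen or step < first_seen[key]:
            first_seen[key] = step

    # Mirrors user_to_devices: order devices by first-seen step, join with ' > '.
    per_user = defaultdict(list)
    for (user_id, device), step in first_seen.items():
        per_user[user_id].append((step, device))
    return {
        user_id: " > ".join(device for _, device in sorted(pairs))
        for user_id, pairs in per_user.items()
    }


rows = [
    ("7bc6de", "step1", "desktop"),
    ("7bc6de", "step2", "desktop"),
    ("7bc6de", "step3", "mobile"),
    ("7bc6de", "step4", "mobile"),
    ("7bc6de", "step5", "desktop"),
    ("0ee6df", "step1", "tablet"),
    ("0ee6df", "step2", "tablet"),
    ("0ee6df", "step3", "desktop"),
    ("0ee6df", "step4", "desktop"),
    ("0ee6df", "step5", "mobile"),
]
result = concat_devices(rows)
print(result["7bc6de"])  # desktop > mobile
print(result["0ee6df"])  # tablet > desktop > mobile
```

Because the step labels are compared as strings, "step10" would sort before "step2". If the data can contain more than nine steps, both the SQL and this sketch would need a numeric sort key instead.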
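If the same user and step can be seen on multiple devices, you need to add an extra WITH clause that keeps only the row you want (e.g., the earliest one), using SELECT DISTINCT ON: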
https://www.db-fiddle.com/f/w9ZRvpQ7KXgdVKCTDAb43o/0
WITH input_data AS (
    -- Keep only the earliest row per (user_id, step).
    SELECT DISTINCT ON (user_id, step) user_id, step, created_at, device
    FROM input_data_with_created_at
    ORDER BY user_id, step, created_at
),
(...) -- Rest of the CTEs, same as before but with timestamp included.
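The effect of the DISTINCT ON step can be checked against the sample data with a short Python sketch. It is an illustration under assumptions, not the answer's code: `earliest_per_step` is a hypothetical helper, the rows are an in-memory copy of the question's table, and on equal timestamps it keeps the first row encountered (PostgreSQL does not define which tied row DISTINCT ON returns).

```python
def earliest_per_step(rows):
    """Keep, for each (user_id, step), the row with the smallest created_at.

    rows: iterable of (user_id, step, created_at, device) tuples, a
    hypothetical stand-in for input_data_with_created_at. Timestamps are
    'YYYY-MM-DD HH:MM:SS' strings, so string order is chronological order.
    """
    best = {}
    for row in rows:
        user_id, step, created_at, _device = row
        key = (user_id, step)
        if key not in best or created_at < best[key][2]:
            best[key] = row
    # Sort by (user_id, step) to mimic the leading ORDER BY columns.
    return [best[key] for key in sorted(best)]


rows = [
    ("user1", "step1", "2021-03-16 14:03:16", "mobile"),
    ("user1", "step2", "2021-03-16 14:04:07", "mobile"),
    ("user1", "step2", "2021-03-16 14:03:47", "desktop"),
    ("user1", "step3", "2021-03-16 14:03:55", "mobile"),
    ("user1", "step3", "2021-03-16 14:04:00", "mobile"),
    ("user1", "step1", "2021-03-16 14:04:02", "desktop"),
    ("user1", "step2", "2021-03-16 14:03:16", "mobile"),
    ("user1", "step3", "2021-03-16 14:04:07", "mobile"),
    ("user1", "step4", "2021-03-16 14:04:08", "desktop"),
    ("user1", "step4", "2021-03-16 14:04:09", "tablet"),
]
deduped = earliest_per_step(rows)
for row in deduped:
    print(row)

# Devices in first-seen order; this shortcut is valid here only because
# the sample contains a single user and deduped is sorted by step.
seen = []
for _user, _step, _ts, device in deduped:
    if device not in seen:
        seen.append(device)
print(" > ".join(seen))  # mobile > desktop
```

The four surviving rows match the expected result in the question: tablet is dropped because its only row (step4 at 14:04:09) loses to the earlier desktop row for the same step.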