Postgres aggregate nested jsonb array values
In Postgres 11.x, I am trying to aggregate elements from nested jsonb objects that have an array field into a single row per device_id. Here is sample data for a table called configurations:
id | device_id | data |
---|---|---|
1 | 1 | {"sensors": [{"other_data": {}, "sensor_type": 1}], "other_data": {}} |
2 | 1 | {"sensors": [{"other_data": {}, "sensor_type": 1}, {"other_data": {}, "sensor_type": 2}], "other_data": {}} |
3 | 1 | {"sensors": [{"other_data": {}, "sensor_type": 3}], "other_data": {}} |
4 | 2 | {"sensors": [{"other_data": {}, "sensor_type": 4}], "other_data": {}} |
5 | 2 | {"sensors": null, "other_data": {}} |
6 | 3 | {"sensors": [], "other_data": {}} |
My target output is one row per device_id containing the set of distinct sensor_types, for example:
device_id | sensor_types |
---|---|
1 | [1,2,3] |
2 | [4] |
3 | [] (null would also be fine here) |
I have tried many things but keep running into various problems. Here is some SQL to set up a test environment:
CREATE TEMPORARY TABLE configurations(
id SERIAL PRIMARY KEY,
device_id SERIAL,
data JSONB
);
INSERT INTO configurations(device_id, data) VALUES
(1, '{ "other_data": {}, "sensors": [ { "sensor_type": 1, "other_data": {} } ] }'),
(1, '{ "other_data": {}, "sensors": [ { "sensor_type": 1, "other_data": {} }, { "sensor_type": 2, "other_data": {} }] }'),
(1, '{ "other_data": {}, "sensors": [ { "sensor_type": 3, "other_data": {} }] }'),
(2, '{ "other_data": {}, "sensors": [ { "sensor_type": 4, "other_data": {} }] }'),
(2, '{ "other_data": {}, "sensors": null }'),
(3, '{ "other_data": {}, "sensors": [] }');
A quick note: my real table has about 100,000 rows and the jsonb data is much more complex, but it follows this general structure.
A JSONB null causes some problems in Postgres and should be avoided whenever possible. You can convert the value to an empty array with the expression:
coalesce(nullif(data->'sensors', 'null'), '[]')
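To see why the conversion is needed: jsonb_array_elements() raises an error when it is handed a JSON null scalar instead of an array. The following is only a small sketch of how the expression behaves for each shape of "sensors"; the inline VALUES rows and the column name input are made up for illustration and are not part of the question's table.

```sql
-- Illustrative inputs only; "input" is a hypothetical column name.
select input,
       coalesce(nullif(input->'sensors', 'null'), '[]') as sensors
from (values
    ('{"sensors": [{"sensor_type": 1}]}'::jsonb),  -- array is kept as-is
    ('{"sensors": null}'::jsonb),                  -- JSON null becomes []
    ('{"sensors": []}'::jsonb),                    -- empty array stays []
    ('{}'::jsonb)                                  -- missing key (SQL NULL) becomes []
) as t(input);
```

nullif() turns the JSON null into an SQL NULL, and coalesce() then replaces any SQL NULL, including the one produced by a missing key, with an empty jsonb array.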
First attempt:
select device_id, array_agg(distinct value->'sensor_type') as sensor_types
from configurations
left join jsonb_array_elements(coalesce(nullif(data->'sensors', 'null'), '[]')) on true
group by device_id;
device_id | sensor_types
-----------+--------------
1 | {1,2,3}
2 | {4,NULL}
3 | {NULL}
(3 rows)
This result may be unsatisfactory because of the nulls. When trying to remove them:
select device_id, array_agg(distinct value->'sensor_type') as sensor_types
from configurations
left join jsonb_array_elements(coalesce(nullif(data->'sensors', 'null'), '[]')) on true
where value is not null
group by device_id;
device_id | sensor_types
-----------+--------------
1 | {1,2,3}
2 | {4}
(2 rows)
device_id = 3 disappears. Well, we can get all the device_ids from the table:
select distinct device_id, sensor_types
from configurations
left join (
select device_id, array_agg(distinct value->'sensor_type') as sensor_types
from configurations
left join jsonb_array_elements(coalesce(nullif(data->'sensors', 'null'), '[]')) on true
where value is not null
group by device_id
) s
using(device_id);
device_id | sensor_types
-----------+--------------
1 | {1,2,3}
2 | {4}
3 |
(3 rows)
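As a possible alternative to the self-join, and not part of the original answer, the nulls can be dropped inside the aggregate with a FILTER clause (available since Postgres 9.4), so every device_id survives the GROUP BY. This is a minimal sketch against the test table above; the second column, sensor_types_json, is a hypothetical addition showing one way to get a jsonb array such as [] instead of NULL.

```sql
select device_id,
       -- FILTER skips the NULL produced by the left join for empty arrays,
       -- without removing the row itself the way WHERE does
       array_agg(distinct value->'sensor_type')
           filter (where value is not null) as sensor_types,
       -- assumption: a jsonb result is wanted; coalesce turns "no elements" into []
       coalesce(
           jsonb_agg(distinct value->'sensor_type')
               filter (where value is not null),
           '[]'
       ) as sensor_types_json
from configurations
left join jsonb_array_elements(coalesce(nullif(data->'sensors', 'null'), '[]')) on true
group by device_id
order by device_id;
```

With the sample data, device_id 3 should stay in the output with a NULL sensor_types and an empty sensor_types_json array. On a table of around 100,000 rows this reads configurations once rather than twice, though it is worth checking the plan on the real data.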