Nested rows using STRUCT are not supported in Dataflow SQL (GCP)

Using Dataflow SQL, I want to read messages from a Pub/Sub topic, enrich them, and write them to another Pub/Sub topic.

Which Dataflow SQL query will produce the output message I want?

Pub/Sub input message: {"event_timestamp":1619784049000, "device":{"ID":"some_id"}}

Desired Pub/Sub output message: {"event_timestamp":1619784049000, "device":{"ID":"some_id", "NAME":"some_name"}}

What I get instead: {"event_timestamp":1619784049000, "device":{"ID":"some_id"}, "NAME":"some_name"}

But I need NAME inside the "device" attribute. This is my current query:

SELECT message_table.device as device, devices.name as NAME 
FROM pubsub.topic.project_id.`topic` as message_table
  JOIN bigquery.table.project_id.dataflow_sql_dataset.devices as devices 
  ON devices.device_id = message_table.device.id

You need to create a struct in the projection (the SELECT part):

SELECT STRUCT(message_table.device.ID as ID , devices.name as NAME) as device
FROM pubsub.topic.project_id.`topic` as message_table
  JOIN bigquery.table.project_id.dataflow_sql_dataset.devices as devices 
  ON devices.device_id = message_table.device.id

Unfortunately, Dataflow SQL does not currently support STRUCT or subqueries, but we are working on it. Some Apache Beam dependencies are blocking progress (nested rows support, upgrading Calcite), so we cannot provide an ETA at the moment, but you can follow its progress on this issue tracker.
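
Until nested rows are supported, one possible workaround is to keep the flat query from the question and reshape each output message in a downstream step. This is not part of the answer above and not a Dataflow SQL feature. The following is only a minimal sketch in plain Python: the function name nest_device_name is hypothetical, and it assumes the flat message has a top-level NAME field and a "device" object, as in the output shown in the question. Where this step would run (for example, a small subscriber on the output topic) is left open.

```python
import json


def nest_device_name(flat_message: str) -> str:
    """Move the top-level NAME field into the nested "device" object.

    Hypothetical helper: assumes the flat message shape from the question.
    """
    record = json.loads(flat_message)
    # Copy the device object so the parsed input dict is not mutated in place.
    device = dict(record.get("device", {}))
    # NAME is the top-level column produced by the flat JOIN query.
    if "NAME" in record:
        device["NAME"] = record.pop("NAME")
    record["device"] = device
    return json.dumps(record)


# The flat message the question reports getting from the current query.
flat = '{"event_timestamp":1619784049000, "device":{"ID":"some_id"}, "NAME":"some_name"}'
print(nest_device_name(flat))
```

Messages without a top-level NAME pass through with their "device" object unchanged, so the helper can be applied to every message on the topic.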