如何从 Beam SQL (SqlTransform) 输出嵌套行?

How to output nested Row from Beam SQL (SqlTransform)?

我想从 Beam SQL (SqlTransform) 的输出中获得带嵌套行的行,但失败了。

问题:

  1. 从SqlTransform 输出带有嵌套行的行的正确方法是什么? (行类型在the docs中有描述,所以我相信它是受支持的)
  2. 如果这是bug/missing的特性,是不是Beam本身的问题?还是跑步者依赖? (我目前在 DirectRunner 上使用,但将来会使用 DataflowRunner。)

版本信息:

这是我尝试过的方法,但没有成功。

方解石方言

SELECT ROW(foo, bar) as my_nested_row FROM PCOLLECTION

我期望此输出行具有以下架构

Field{name=my_nested_row, description=, type=ROW<foo STRING NOT NULL, bar INT64 NOT NULL> NOT NULL, options={{}}}

但实际上行被分成标量字段,如

Field{name=my_nested_row$[=13=], description=, type=STRING NOT NULL, options={{}}}
Field{name=my_nested_row$, description=, type=INT64 NOT NULL, options={{}}}

泽塔 SQL

SELECT STRUCT(foo, bar) as my_nested_row FROM PCOLLECTION

我收到一个错误

java.lang.UnsupportedOperationException: Does not support expr node kind RESOLVED_MAKE_STRUCT
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:363)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:323)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromComputedColumnWithFieldList (ExpressionConverter.java:375)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.retrieveRexNode (ExpressionConverter.java:203)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:45)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:29)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode (QueryStatementConverter.java:102)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convert (QueryStatementConverter.java:89)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertRootQuery (QueryStatementConverter.java:55)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.rel (ZetaSQLPlannerImpl.java:98)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal (ZetaSQLQueryPlanner.java:197)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel (ZetaSQLQueryPlanner.java:185)
    at org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery (BeamSqlEnv.java:111)
    at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:171)
    at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:109)
    at org.apache.beam.sdk.Pipeline.applyInternal (Pipeline.java:548)
    at org.apache.beam.sdk.Pipeline.applyTransform (Pipeline.java:482)
    at org.apache.beam.sdk.values.PCollection.apply (PCollection.java:363)
    at dev.tmshn.playbeam.Main.main (Main.java:29)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.mojo.exec.ExecJavaMojo.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:829)

不幸的是,Beam SQL 尚不支持嵌套行,主要是由于缺乏对 Calcite 的支持(因此相应缺乏对 ZetaSQL 实现的支持)。看到这个 .

好的方面是,Jira issue tracking this support 似乎已针对 2.34.0 解决,因此可能即将提供适当的支持。