如何从 Beam SQL (SqlTransform) 输出嵌套行?
How to output nested Row from Beam SQL (SqlTransform)?
我想从 Beam SQL (SqlTransform) 的输出中获得带嵌套行的行,但失败了。
问题:
- 从SqlTransform 输出带有嵌套行的行的正确方法是什么? (行类型在the docs中有描述,所以我相信它是受支持的)
- 如果这是bug/missing的特性,是不是Beam本身的问题?还是跑步者依赖? (我目前在 DirectRunner 上使用,但将来会使用 DataflowRunner。)
版本信息:
- OS: macOS 10.15.7 (卡特琳娜)
- Java: 11.0.11 (采用OpenJDK)
- Beam SDK:2.32.0
这是我尝试过的方法,但没有成功。
方解石方言
SELECT ROW(foo, bar) as my_nested_row FROM PCOLLECTION
我期望此输出行具有以下架构
Field{name=my_nested_row, description=, type=ROW<foo STRING NOT NULL, bar INT64 NOT NULL> NOT NULL, options={{}}}
但实际上行被分成标量字段,如
Field{name=my_nested_row$[=13=], description=, type=STRING NOT NULL, options={{}}}
Field{name=my_nested_row$, description=, type=INT64 NOT NULL, options={{}}}
泽塔 SQL
SELECT STRUCT(foo, bar) as my_nested_row FROM PCOLLECTION
我收到一个错误
java.lang.UnsupportedOperationException: Does not support expr node kind RESOLVED_MAKE_STRUCT
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:363)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:323)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromComputedColumnWithFieldList (ExpressionConverter.java:375)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.retrieveRexNode (ExpressionConverter.java:203)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:45)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:29)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode (QueryStatementConverter.java:102)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convert (QueryStatementConverter.java:89)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertRootQuery (QueryStatementConverter.java:55)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.rel (ZetaSQLPlannerImpl.java:98)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal (ZetaSQLQueryPlanner.java:197)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel (ZetaSQLQueryPlanner.java:185)
at org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery (BeamSqlEnv.java:111)
at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:171)
at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:109)
at org.apache.beam.sdk.Pipeline.applyInternal (Pipeline.java:548)
at org.apache.beam.sdk.Pipeline.applyTransform (Pipeline.java:482)
at org.apache.beam.sdk.values.PCollection.apply (PCollection.java:363)
at dev.tmshn.playbeam.Main.main (Main.java:29)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:566)
at org.codehaus.mojo.exec.ExecJavaMojo.run (ExecJavaMojo.java:282)
at java.lang.Thread.run (Thread.java:829)
不幸的是,Beam SQL 尚不支持嵌套行,主要是由于缺乏对 Calcite 的支持(因此相应缺乏对 ZetaSQL 实现的支持)。看到这个 .
好的方面是,Jira issue tracking this support 似乎已针对 2.34.0 解决,因此可能即将提供适当的支持。
我想从 Beam SQL (SqlTransform) 的输出中获得带嵌套行的行,但失败了。
问题:
- 从SqlTransform 输出带有嵌套行的行的正确方法是什么? (行类型在the docs中有描述,所以我相信它是受支持的)
- 如果这是bug/missing的特性,是不是Beam本身的问题?还是跑步者依赖? (我目前在 DirectRunner 上使用,但将来会使用 DataflowRunner。)
版本信息:
- OS: macOS 10.15.7 (卡特琳娜)
- Java: 11.0.11 (采用OpenJDK)
- Beam SDK:2.32.0
这是我尝试过的方法,但没有成功。
方解石方言
SELECT ROW(foo, bar) as my_nested_row FROM PCOLLECTION
我期望此输出行具有以下架构
Field{name=my_nested_row, description=, type=ROW<foo STRING NOT NULL, bar INT64 NOT NULL> NOT NULL, options={{}}}
但实际上行被分成标量字段,如
Field{name=my_nested_row$[=13=], description=, type=STRING NOT NULL, options={{}}}
Field{name=my_nested_row$, description=, type=INT64 NOT NULL, options={{}}}
泽塔 SQL
SELECT STRUCT(foo, bar) as my_nested_row FROM PCOLLECTION
我收到一个错误
java.lang.UnsupportedOperationException: Does not support expr node kind RESOLVED_MAKE_STRUCT
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:363)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:323)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromComputedColumnWithFieldList (ExpressionConverter.java:375)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.retrieveRexNode (ExpressionConverter.java:203)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:45)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:29)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode (QueryStatementConverter.java:102)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convert (QueryStatementConverter.java:89)
at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertRootQuery (QueryStatementConverter.java:55)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.rel (ZetaSQLPlannerImpl.java:98)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal (ZetaSQLQueryPlanner.java:197)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel (ZetaSQLQueryPlanner.java:185)
at org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery (BeamSqlEnv.java:111)
at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:171)
at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:109)
at org.apache.beam.sdk.Pipeline.applyInternal (Pipeline.java:548)
at org.apache.beam.sdk.Pipeline.applyTransform (Pipeline.java:482)
at org.apache.beam.sdk.values.PCollection.apply (PCollection.java:363)
at dev.tmshn.playbeam.Main.main (Main.java:29)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:566)
at org.codehaus.mojo.exec.ExecJavaMojo.run (ExecJavaMojo.java:282)
at java.lang.Thread.run (Thread.java:829)
不幸的是,Beam SQL 尚不支持嵌套行,主要是由于缺乏对 Calcite 的支持(因此相应缺乏对 ZetaSQL 实现的支持)。看到这个
好的方面是,Jira issue tracking this support 似乎已针对 2.34.0 解决,因此可能即将提供适当的支持。