How to flatten a Parquet array datatype when using IBM Cloud SQL Query

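I have to push data from Parquet files, which I read with IBM Cloud SQL Query, to Db2 on Cloud.

My Parquet file contains data in array format, and I want to push that to Db2 on Cloud as well.

Is there any way to push the array data from a Parquet file to Db2 on Cloud?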

Have you checked this advice in the documentation?

https://cloud.ibm.com/docs/services/sql-query?topic=sql-query-overview#limitations

If a JSON, ORC, or Parquet object contains a nested or arrayed structure, a query with CSV output using a wildcard (for example, SELECT * from cos://...) returns an error such as "Invalid CSV data type used: struct." Use one of the following workarounds:

  • For a nested structure, use the FLATTEN table transformation function.
  • Alternatively, you can specify the fully nested column names instead of the wildcard, for example, SELECT address.city, address.street, ... from cos://....
  • For an array, use the Spark SQL explode() function, for example, select explode(contact_names) from cos://....
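
To make the array workaround concrete, here is a minimal plain-Python sketch of what explode() does to the row shape: each array element becomes its own row, and the other columns are repeated alongside it. This is only a model of the semantics, not IBM or Spark code. The helper `explode_rows`, the `id` column, and the sample values are hypothetical; only `contact_names` comes from the documentation example. In SQL the equivalent would presumably look like select id, explode(contact_names) as contact_name from cos://....

```python
from typing import Any, Dict, Iterable, Iterator, List


def explode_rows(
    rows: Iterable[Dict[str, Any]], array_col: str, out_col: str
) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper that mimics Spark SQL explode() on one column.

    Emits one output row per array element. Rows whose array is empty or
    missing produce no output, which matches explode() (not explode_outer()).
    """
    for row in rows:
        values: List[Any] = row.get(array_col) or []
        for value in values:
            # Copy every scalar column, then add the single array element.
            flat = {k: v for k, v in row.items() if k != array_col}
            flat[out_col] = value
            yield flat


# Hypothetical sample data standing in for the Parquet rows.
rows = [
    {"id": 1, "contact_names": ["Ann", "Bob"]},
    {"id": 2, "contact_names": ["Cy"]},
    {"id": 3, "contact_names": []},
]

flat_rows = list(explode_rows(rows, "contact_names", "contact_name"))
for r in flat_rows:
    print(r)
```

After this step every column holds a scalar value, so the result no longer contains an array type and can be written to a flat target such as CSV or a relational table. Note that the row with the empty array disappears; if those rows must be kept, Spark SQL's explode_outer() is the variant that retains them with a null.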