Querying S3 inventory details in Athena

I have S3 inventory details stored in an S3 bucket, and I am querying them through Athena.

My first two columns look like this:

bucket                  key
bke-pod-bke-lca-data    dl/xxxxxx/plant/archive/01-01-2019/1546300856.json
bke-pod-bke-lca-data    dl/xxxx/plant/archive/01-01-2019/1546300856.json
bke-pod-bke-lca-data    dl/xxx/plant/archive/01-01-2019/1546300856.json

I need the key split into the following columns:

bucket                  Categ   Type    Date        File
bke-pod-bke-lca-data    xxxxxx  archive 01/01/2019  1546300856.json
bke-pod-bke-lca-data    xxxx    working 01/01/2019  1546300856.json
bke-pod-bke-lca-data    xxx     archive 01/01/2019  1546300856.json

I tried substr, but it did not work.

How can I split the key on /?

The Presto 0.172 documentation (6.8. String Functions and Operators) has:

split_part(string, delimiter, index) Splits string on delimiter and returns the field index. Field indexes start with 1. If the index is larger than the number of fields, then null is returned.
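To see how the field indexes line up with the key layout, here is a minimal sketch that runs split_part against one of the sample keys as a literal, so no table is needed. The inline VALUES row and the column aliases are only for illustration. The last column uses replace, another Presto string function, on the assumption that the date should be shown with slashes as in the desired output.

```sql
-- Sketch: check the field positions on a single sample key.
-- The VALUES row and all aliases here are illustrative, not part of the real table.
SELECT
  split_part(k, '/', 1) AS first_field,   -- 'dl'
  split_part(k, '/', 2) AS category,      -- 'xxxxxx'
  split_part(k, '/', 4) AS type,          -- 'archive'
  split_part(k, '/', 6) AS file_name,     -- '1546300856.json'
  split_part(k, '/', 7) AS past_the_end,  -- NULL: the key has only 6 fields
  -- Assumption: swap '-' for '/' to match the 01/01/2019 format asked for.
  replace(split_part(k, '/', 5), '-', '/') AS date_with_slashes  -- '01/01/2019'
FROM (
  VALUES 'dl/xxxxxx/plant/archive/01-01-2019/1546300856.json'
) AS t (k);
```

Field 3 ('plant') is simply skipped, which is why the indexes jump from 2 to 4.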

So you should be able to use something like:

SELECT
  bucket,
  split_part(key, '/', 2) as category,
  split_part(key, '/', 4) as type,
  split_part(key, '/', 5) as date,
  split_part(key, '/', 6) as file