Querying S3 inventory details in Athena
I have S3 inventory details in an S3 bucket, and I am querying them through Athena.
My first two columns look like this:
bucket                key
bke-pod-bke-lca-data  dl/xxxxxx/plant/archive/01-01-2019/1546300856.json
bke-pod-bke-lca-data  dl/xxxx/plant/archive/01-01-2019/1546300856.json
bke-pod-bke-lca-data  dl/xxx/plant/archive/01-01-2019/1546300856.json
I need to split the key into the following columns:
bucket                Categ   Type     Date        File
bke-pod-bke-lca-data  xxxxxx  archive  01/01/2019  1546300856.json
bke-pod-bke-lca-data  xxxx    working  01/01/2019  1546300856.json
bke-pod-bke-lca-data  xxx     archive  01/01/2019  1546300856.json
I tried substr, but it didn't work.
How can I split the key on /?
The Presto 0.172 documentation (6.8. String Functions and Operators) has:
split_part(string, delimiter, index)
Splits string on delimiter and returns the field index. Field indexes start with 1. If the index is larger than the number of fields, then null is returned.
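To make the indexing concrete, here is a small sketch that calls split_part on one of the sample keys from the question as a literal string. The column aliases (first_field, last_field, past_the_end) are made up for illustration.

```sql
-- Minimal sketch: split_part on a literal key from the question.
-- The key has 6 fields when split on '/'.
SELECT
    split_part('dl/xxxxxx/plant/archive/01-01-2019/1546300856.json', '/', 1) AS first_field,   -- 'dl'
    split_part('dl/xxxxxx/plant/archive/01-01-2019/1546300856.json', '/', 6) AS last_field,    -- '1546300856.json'
    split_part('dl/xxxxxx/plant/archive/01-01-2019/1546300856.json', '/', 7) AS past_the_end   -- NULL (only 6 fields)
```

Because indexes start at 1, the leading dl prefix is field 1, so the category is field 2 rather than field 1.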
So, you should be able to use something like:
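SELECT
    bucket,
    split_part(key, '/', 2) AS category,  -- e.g. xxxxxx
    split_part(key, '/', 4) AS type,      -- e.g. archive
    split_part(key, '/', 5) AS date,      -- e.g. 01-01-2019
    split_part(key, '/', 6) AS file       -- e.g. 1546300856.json
FROM your_inventory_table  -- placeholder: replace with your Athena table name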
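The desired output in the question shows the date with slashes (01/01/2019), while the key contains dashes (01-01-2019). One possible way to handle that is to wrap the date field in replace(). The sketch below is self-contained: it uses an inline VALUES list built from the sample rows instead of a real table, and the names inventory, file_date and file_name are assumptions chosen for illustration.

```sql
-- Sketch under assumptions: inline sample rows stand in for the real inventory table.
SELECT
    bucket,
    split_part(key, '/', 2) AS category,
    split_part(key, '/', 4) AS type,
    -- Turn 01-01-2019 into 01/01/2019 to match the requested format.
    replace(split_part(key, '/', 5), '-', '/') AS file_date,
    split_part(key, '/', 6) AS file_name
FROM (
    VALUES
        ('bke-pod-bke-lca-data', 'dl/xxxxxx/plant/archive/01-01-2019/1546300856.json'),
        ('bke-pod-bke-lca-data', 'dl/xxxx/plant/archive/01-01-2019/1546300856.json'),
        ('bke-pod-bke-lca-data', 'dl/xxx/plant/archive/01-01-2019/1546300856.json')
) AS inventory (bucket, key)
```

The result is still a plain string. If a real DATE value is needed, it would have to be parsed separately.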