How to delete data physically with Presto/Trino?

In my Presto (358) installation, I have two working Hive connectors, one for S3 and one for ABFS.

Everything works fine, but when I call DROP (TABLE/SCHEMA) or DELETE FROM, the deletion only happens in the Metastore and no data is physically deleted. This happens on both S3 and ABFS.

This becomes a real problem when replacing data:

> DROP TABLE hive.abc; 
-- ok

> CREATE TABLE hive.abc AS (...) 
-- ERROR: Target directory 'abc' already exists.

The same goes for dropping partitions, etc.

Is there any way to actually delete the data?

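Found the solution. The main difference is between specifying location on the schema and specifying external_location on its tables, as the two cases below show.
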
CREATE SCHEMA hive.xyz WITH (location = 'abfs://...');
CREATE TABLE hive.xyz.test AS SELECT (...);

DELETE FROM hive.xyz.test WHERE TRUE;

-- Data ARE physically deleted

CREATE SCHEMA hive.xyz;
CREATE TABLE hive.xyz.test 
    WITH (external_location = 'abfs://...') 
    AS SELECT (...);

DELETE FROM hive.xyz.test WHERE TRUE;

-- Data ARE NOT physically deleted.

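Conclusion: setting external_location on a table prevents its data from being deleted. Such a table is external (unmanaged), so the Hive connector removes only the Metastore entry and leaves the files in place.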
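
Applied to the original replace-data problem, the finding above can be sketched as follows. This is an untested illustration that reuses the hive.xyz names from the examples. The `SELECT 1 AS id` query is a hypothetical stand-in for the real one, and the `abfs://...` path is left elided as in the original. It assumes the table is created without external_location, so that it is a managed table.

```sql
-- Schema-level location: tables created in it without external_location are managed.
CREATE SCHEMA hive.xyz WITH (location = 'abfs://...');

-- Hypothetical placeholder query; substitute the real SELECT.
CREATE TABLE hive.xyz.test AS SELECT 1 AS id;

-- To check how an existing table was defined, inspect its DDL:
-- an external table should list external_location in its WITH clause.
SHOW CREATE TABLE hive.xyz.test;

-- For a managed table, DROP is expected to remove the files as well,
-- so re-creating the table should not hit "Target directory already exists".
DROP TABLE hive.xyz.test;
CREATE TABLE hive.xyz.test AS SELECT 1 AS id;
```

For tables that must stay external, the files still have to be removed outside of Presto/Trino, for example with the storage provider's own tools.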