Row Level Security, poor performance
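I am evaluating whether PostgreSQL's Row Level Security (RLS) feature can be used to soft-delete customers. Unfortunately, I am running into poor performance. Here is a simple test setup in PostgreSQL version 9.5.10:
A table containing 10,000,000 customers: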
CREATE TABLE customers (
    customer_id integer PRIMARY KEY,
    name text,
    hidden boolean DEFAULT FALSE
);
INSERT INTO customers (customer_id, name) SELECT generate_series(0, 9999999), 'John Doe';
ANALYZE customers;
A table containing one order per customer:
CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    customer_id integer REFERENCES customers (customer_id)
);
INSERT INTO orders (order_id, customer_id) SELECT generate_series(0, 9999999), generate_series(0, 9999999);
ANALYZE orders;
An untrusted user who will only execute SELECTs:
CREATE ROLE untrusted;
GRANT SELECT ON customers TO untrusted;
GRANT SELECT ON orders TO untrusted;
A policy that makes hidden customers invisible to the untrusted user:
CREATE POLICY no_hidden_customers ON customers FOR SELECT TO untrusted USING (hidden IS FALSE);
ALTER TABLE customers ENABLE ROW LEVEL SECURITY;
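The setup does not show how the session switches between the two cases measured in the question. A minimal sketch, assuming the session is connected as a superuser (superusers and, by default, the table owner bypass row security, so their queries are not filtered by the policy), is to check that the policy is in place and then run the test query as the untrusted role:

```sql
-- Assumption: connected as a superuser, or as a role that is a member of "untrusted",
-- so that SET ROLE untrusted is permitted.

-- Confirm that row security is enabled on the table and list its policies.
SELECT relrowsecurity, relforcerowsecurity FROM pg_class WHERE relname = 'customers';
SELECT policyname, roles, cmd, qual FROM pg_policies WHERE tablename = 'customers';

-- Run the test query as the restricted role so the policy is applied.
SET ROLE untrusted;
EXPLAIN ANALYZE SELECT name FROM orders JOIN customers USING (customer_id) WHERE order_id = 4711;

-- Return to the original role.
RESET ROLE;
```

If the policy should also apply to the table owner, ALTER TABLE customers FORCE ROW LEVEL SECURITY does that; superusers and roles with BYPASSRLS are still exempt.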
A simple test query: what is the name of the customer who placed the order with order_id = 4711?
Without RLS:
EXPLAIN ANALYZE SELECT name FROM orders JOIN customers USING (customer_id) WHERE order_id = 4711;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.87..16.92 rows=1 width=9) (actual time=0.121..0.123 rows=1 loops=1)
   ->  Index Scan using orders_pkey on orders  (cost=0.43..8.45 rows=1 width=4) (actual time=0.078..0.078 rows=1 loops=1)
         Index Cond: (order_id = 4711)
   ->  Index Scan using customers_pkey on customers  (cost=0.43..8.45 rows=1 width=13) (actual time=0.039..0.040 rows=1 loops=1)
         Index Cond: (customer_id = orders.customer_id)
 Planning time: 0.476 ms
 Execution time: 0.153 ms
(7 rows)
With RLS:
EXPLAIN ANALYZE SELECT name FROM orders JOIN customers USING (customer_id) WHERE order_id = 4711;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=8.46..291563.48 rows=1 width=9) (actual time=1.494..2565.121 rows=1 loops=1)
   Hash Cond: (customers.customer_id = orders.customer_id)
   ->  Seq Scan on customers  (cost=0.00..154055.00 rows=10000000 width=13) (actual time=0.010..1784.086 rows=10000000 loops=1)
         Filter: (hidden IS FALSE)
   ->  Hash  (cost=8.45..8.45 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Index Scan using orders_pkey on orders  (cost=0.43..8.45 rows=1 width=4) (actual time=0.012..0.013 rows=1 loops=1)
               Index Cond: (order_id = 4711)
 Planning time: 0.358 ms
 Execution time: 2565.170 ms
(10 rows)
How can I avoid the sequential scan when joining the tables? I have tried every index I could think of, to no avail.
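One way to tell whether a missing index is really the problem is to discourage sequential scans for a single transaction and look at the plan again. This is only a diagnostic sketch, not a fix, and it assumes the session may switch to the untrusted role:

```sql
BEGIN;

-- Heavily penalize sequential scans for this transaction only.
SET LOCAL enable_seqscan = off;

-- Apply the policy by running as the restricted role (reverted at ROLLBACK).
SET LOCAL ROLE untrusted;

EXPLAIN ANALYZE SELECT name FROM orders JOIN customers USING (customer_id) WHERE order_id = 4711;

-- Discard both settings.
ROLLBACK;
```

enable_seqscan = off does not forbid sequential scans, it only makes them look very expensive. If the plan still contains a Seq Scan on customers, the planner has no index-based path for that join under the policy, and adding further indexes is unlikely to help.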
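I suggest upgrading to the latest Postgres version, 10.3.
The performance of the row level security feature has improved significantly since version 9.5.
For example, see this improvement, which is only available from Postgres 10.0 onward: https://github.com/postgres/postgres/commit/215b43cdc8d6b4a1700886a39df1ee735cb0274d
I don't think it makes sense to try to optimize RLS queries on Postgres 9.5: RLS was a very new feature at the time and had not really been optimized for performance yet. Just upgrade.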