PostgreSQL database getting slow (SELECT) for millions of records

Question

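My PostgreSQL database becomes slow when I fetch data from a table with millions of records. I tried a materialized view; it is very fast, but it does not provide real-time data.

I am also using aggregates, e.g. SUM, COUNT, GROUP BY, etc.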

SELECT offer_id AS off_id,
       COUNT(DISTINCT ip) AS hosts,
       COUNT(DISTINCT click_id) AS clicks
FROM offer_affiliate_stats
WHERE created_dt >= '2019-06-01'
  AND created_dt <= '2019-06-30'
GROUP BY off_id;

I have tried a materialized view.

Indexes exist on id, created_dt, and click_id.

My output should look like this:

off_id               | 79
hosts                | 4
clicks               | 4
offer_name           | "Testing Javelin"
offer_id             | 
total_conversions    | 
total_income         | 
optimised_count      | 
optimised_income     | 
approved_income      | 
approved_conversions | 
declined_income      | 
declined_conversions | 
total_payout         |

Without the DISTINCT keyword the query works perfectly, but with DISTINCT it takes a very long time.

Answer 1

Have you tuned your database configuration?

See the following link: https://www.postgresql.org/docs/current/runtime-config-resource.html

In particular, work_mem defaults to 4MB. You could increase it, for example to 100MB.
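
As a minimal sketch of the work_mem suggestion, the setting can be tried for a single session or transaction before changing it server-wide. The 100MB figure is the one suggested above, not a measured optimum. Since work_mem applies per sort or hash operation rather than per connection, a large global value combined with many concurrent queries can use far more memory than the number suggests.

```sql
-- Check the current value (the default is 4MB).
SHOW work_mem;

-- Raise it for the current session only.
SET work_mem = '100MB';

-- Alternatively, limit the change to one transaction.
BEGIN;
SET LOCAL work_mem = '100MB';
-- ... run the report query here ...
COMMIT;

-- To persist it server-wide (requires superuser), then reload the config.
ALTER SYSTEM SET work_mem = '100MB';
SELECT pg_reload_conf();
```

Trying the session-level setting first makes it possible to compare timings of the report query before and after without affecting other connections.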

Then change your query to:

SELECT offer_id AS off_id,
       COUNT(ip) AS hosts,
       COUNT(click_id) AS clicks
FROM (
    SELECT DISTINCT offer_id,
           ip,
           click_id
    FROM offer_affiliate_stats
    WHERE created_dt >= '2019-06-01'
      AND created_dt <= '2019-06-30'
) AS t
GROUP BY off_id;
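
One caveat worth checking before adopting this rewrite: the subquery deduplicates (offer_id, ip, click_id) combinations, so the outer COUNT(ip) counts distinct ip and click_id pairs rather than distinct ip values. The sketch below compares both queries on a small temporary table. The column types and the three sample rows are assumptions made for illustration, since the question does not show the table definition.

```sql
-- Hypothetical sample data: one offer, two hosts, three clicks.
CREATE TEMP TABLE offer_affiliate_stats (
    offer_id   integer,
    ip         text,
    click_id   integer,
    created_dt date
);

INSERT INTO offer_affiliate_stats VALUES
    (79, '10.0.0.1', 1, '2019-06-05'),
    (79, '10.0.0.1', 2, '2019-06-06'),
    (79, '10.0.0.2', 3, '2019-06-07');

-- Original query from the question.
-- Result: off_id = 79, hosts = 2, clicks = 3
SELECT offer_id AS off_id,
       COUNT(DISTINCT ip) AS hosts,
       COUNT(DISTINCT click_id) AS clicks
FROM offer_affiliate_stats
WHERE created_dt >= '2019-06-01'
  AND created_dt <= '2019-06-30'
GROUP BY offer_id;

-- Rewritten query from this answer.
-- Result: off_id = 79, hosts = 3, clicks = 3
SELECT offer_id AS off_id,
       COUNT(ip) AS hosts,
       COUNT(click_id) AS clicks
FROM (
    SELECT DISTINCT offer_id, ip, click_id
    FROM offer_affiliate_stats
    WHERE created_dt >= '2019-06-01'
      AND created_dt <= '2019-06-30'
) AS t
GROUP BY offer_id;
```

The two queries agree only when each ip has at most one click_id per offer in the selected range. If hosts must be an exact distinct count, the rewrite is not a drop-in replacement. Running EXPLAIN (ANALYZE, BUFFERS) on the original query would show whether the time goes into sorting for the DISTINCT aggregates, which is the step that work_mem affects.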


postgresql

query-performance