PostgreSQL:使用临时表调用过程时出现分段错误

PostgreSQL: Segmentation Fault when calling procedure with temporary tables

对于没有更好的标题,我深表歉意。我不知道确切的问题是什么。

PostgreSQL 13.4
TimescaleDB 2.4.2
Ubuntu 18.04.6 LTS 运行 在 Docker 中(Docker 文件往下)
shared_memory16GB
shm 大小 2GB

底部的查询导致 postgres 关闭并出现错误:

 2022-05-09 15:17:42.012 UTC [1] LOG:  server process (PID 1316) was terminated by signal 11: Segmentation fault
 2022-05-09 15:17:42.012 UTC [1] DETAIL:  Failed process was running: CALL process_wifi_traffic_for_range('2022-01-10','2022-01-12')
 2022-05-09 15:17:42.012 UTC [1] LOG:  terminating any other active server processes
 2022-05-09 15:17:42.013 UTC [654] WARNING:  terminating connection because of crash of another server process
 2022-05-09 15:17:42.013 UTC [654] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
 2022-05-09 15:17:42.013 UTC [654] HINT:  In a moment you should be able to reconnect to the database and repeat your command.

我想每天处理交通数据,因为我的数据湖很大。 首先,我将未处理的日期加载到临时 table 中。 然后我提取之前和现在报告记录的流量之间的差异,因为来源只给出了应计的总流量。 tables wifi_traffic_last_traffic、t_last_traffic 只是为了跟踪上次程序 运行 时每个客户端表单的最后已知流量。 ht_wifi_traffic_processed 是一个 timescaledb hypertable,但是当我使用普通 table 时也会出现错误。如果出现内存问题,将时间范围从一天缩短到 1 小时也无济于事。有时1、2天搞定,搞定的数据是正确的。

查询有点长,但我不想遗漏任何内容:

DECLARE
    f_date date = from_date::date;
    t_date date = to_date::date;
BEGIN
    SET SESSION temp_buffers = '1GB';
    
    CREATE TEMPORARY TABLE t_last_traffic (
        clientid text,
        devicecpuid text,
        traffic bigint,
        PRIMARY KEY(clientid,devicecpuid)
    );
    
    INSERT INTO t_last_traffic (
        SELECT * FROM wifi_traffic_last_traffic
    );
    
    CREATE TEMPORARY TABLE unprocessed (
        "timestamp" timestamp,
        clientid text,
        devicecpuid text,
        customer text,
        traffic bigint
    ) ON COMMIT DELETE ROWS;
    
    CREATE TEMPORARY TABLE processed (
        "timestamp" timestamp,
        clientid text,
        devicecpuid text,
        customer text,
        traffic bigint,
        PRIMARY KEY("timestamp", clientid, devicecpuid)
    ) ON COMMIT DELETE ROWS;
    
    LOOP
        RAISE NOTICE 'Processing date: %', f_date;
    
        INSERT INTO unprocessed
        SELECT wt."timestamp", wt.clientid, wt.devicecpuid, wt.customer, wt.traffic 
        FROM wifi_traffic AS wt
        WHERE wt."timestamp"
              BETWEEN
                      f_date::timestamp
                  AND
                      f_date::timestamp + INTERVAL '1 day'
        ORDER BY
            devicecpuid ASC, --Important to sort by cpuID first as to not mix traffic results.
            clientid ASC,
            wt."timestamp" ASC;
        
        RAISE NOTICE 'Unprocessed import done';
        
        INSERT INTO processed
            SELECT * FROM (
                SELECT
                    up."timestamp",
                    up.clientid,
                    up.devicecpuid,
                    up.customer,
                    wifi_traffic_lag(
                        up.traffic, 
                        lag(
                            up.traffic, 
                            1, 
                            COALESCE(
                                (
                                    SELECT lt.traffic
                                    FROM t_last_traffic lt
                                    WHERE 
                                        lt.clientid = up.clientid
                                    AND
                                        lt.devicecpuid = up.devicecpuid
                                    FETCH FIRST ROW ONLY
                                ),
                                CAST(0 AS bigint)
                            )
                        )   
                        OVER (
                            PARTITION BY 
                                up.clientid, 
                                up.devicecpuid 
                            ORDER BY 
                                up.clientid, 
                                up.devicecpuid, 
                                up."timestamp"
                        )
                    ) AS traffic
                FROM unprocessed up
                WHERE
                    up.traffic != 0
            ) filtered
            WHERE
                filtered.traffic > 0
        ON CONFLICT ON CONSTRAINT processed_pkey DO NOTHING;
        
            
        RAISE NOTICE 'Processed import done';
            
            
        INSERT INTO t_last_traffic(devicecpuid, clientid, traffic)
            SELECT up.devicecpuid, up.clientid, MAX(up.traffic)
            FROM unprocessed up
            GROUP BY up.devicecpuid, up.clientid
        ON CONFLICT ON CONSTRAINT t_last_traffic_pkey DO UPDATE SET
            traffic = EXCLUDED.traffic;
            
        INSERT INTO ht_wifi_traffic_processed
        SELECT * FROM processed;
        
        TRUNCATE TABLE unprocessed;
        TRUNCATE TABLE processed;
        
        COMMIT;
        
        RAISE NOTICE 'Finished processing for date: %', f_date;
        
        f_date = f_date + 1;
        EXIT WHEN f_date > t_date;
    END LOOP;
    
    INSERT INTO wifi_traffic_last_traffic
    SELECT * FROM t_last_traffic
    ON CONFLICT ON CONSTRAINT wifi_traffic_last_traffic_pkey DO UPDATE SET
        traffic = EXCLUDED.traffic;
        
    DROP TABLE t_last_traffic;
    DROP TABLE unprocessed;
    DROP TABLE processed;
    
    COMMIT;
END

Docker撰写:

services:
  postgres-storage:
    image: <redacted>/postgres_gis_tdb:pg13_tdb2.4.2_gis3.1.4
    restart: unless-stopped
    shm_size: 2gb
    ports:
      - '5433:5432'
    networks:
      - bigdata
    volumes:
      - /mnt/data_storage/psql_data:/var/lib/postgresql/data
      - /usr/docker-volumes/postgres-storage:/var/lib/postgresql/ssd
    environment:
      POSTGRES_USER: postgres
    env_file:
      - .env

Docker文件:

FROM postgres:13

ENV POSTGIS_MAJOR 3
ENV POSTGIS_VERSION 3.1.4+dfsg-1.pgdg100+1


### INSTALL POSTGIS ###

RUN apt-get update \
      && apt-cache showpkg postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR \
      && apt-get install -y --no-install-recommends \
           postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR=$POSTGIS_VERSION \
           postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR-scripts \
      && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /docker-entrypoint-initdb.d
COPY ./initdb-postgis.sh /docker-entrypoint-initdb.d/10_postgis.sh
COPY ./update-postgis.sh /usr/local/bin


###         INSTALL TIMESCALEDB           ###
# Important: Run timescaledb-tune           #
# once for COMPLETELY NEW DATABASES,        #
# so no existing postgresql_data.           #
###                                       ###
RUN apt-get update \
      && apt-get install -y postgresql-common wget lsb-release
RUN echo "yes" | sh /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
RUN sh -c "echo 'deb [signed-by=/usr/share/keyrings/timescale.keyring] https://packagecloud.io/timescale/timescaledb/debian/ $(ls$
RUN wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey |  gpg --dearmor -o /usr/share/keyrings/timescale.keyr$
RUN apt-get update \
      && apt-get install -y timescaledb-2-postgresql-13 timescaledb-tools

似乎错误来自于

CREATE TEMPORARY TABLE unprocessed (
        "timestamp" timestamp,
        clientid text,
        devicecpuid text,
        customer text,
        traffic bigint
    ) ON COMMIT DELETE ROWS; 

以及:

TRUNCATE TABLE unprocessed;

我最初这样做是因为测试表明 ON COMMIT DELETE ROWS 在程序中间的 COMMIT 之后并没有真正清除 table。 将它排除在外可以防止错误发生,进一步的测试表明即使没有它,数据也符合预期。这似乎是某种竞争条件。 我也会在 postgres github 中 post 这个。