Postgres inconsistent insertion performance
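I'm inserting into Postgres using JDBC. For some reason, performance varies between runs. I observe it switching between two performance levels: "fast" and roughly 10 times slower.
I have tested this with Postgres 14.2, 14.3 and 13.6. It appears to affect all of these versions.
What are the possible reasons for this behavior?
MRE (Java 17):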
package test;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.sql.Connection;
import java.sql.DriverManager;

public class App {
    private static final Logger log = LoggerFactory.getLogger(App.class);

    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        try (Connection connection = DriverManager.getConnection("jdbc:postgresql://172.17.0.3:5432/postgres", "postgres", "postgres")) {
            connection.setAutoCommit(false);
            var uploadedCount = 0;
            final var insertA = connection.prepareStatement("insert into A(AA) values(?)");
            final var insertB = connection.prepareStatement("insert into B(BA, BB) values(?, ?)");
            final var startTime = System.currentTimeMillis();
            var timeA = 0;
            var timeB = 0;
            var batchSize = 0;
            var batchesCount = 0;
            for (long i = 0; i < 16_384; i++) {
                insertA.setObject(1, i);
                insertA.addBatch();
                insertB.setObject(1, i);
                insertB.setObject(2, i);
                insertB.addBatch();
                batchSize++;
                if (batchSize == 32) {
                    var start = System.currentTimeMillis();
                    insertA.executeBatch();
                    timeA += System.currentTimeMillis() - start;
                    start = System.currentTimeMillis();
                    insertB.executeBatch();
                    timeB += System.currentTimeMillis() - start;
                    uploadedCount += batchSize * 2;
                    batchSize = 0;
                    batchesCount++;
                    if (batchesCount % 256 == 0) {
                        log.info("INSERT A CUMULATIVE TIME: {}", timeA);
                        log.info("INSERT B CUMULATIVE TIME: {}", timeB);
                        log.info("ROWS PER SEC: {}", uploadedCount * 1000L / (System.currentTimeMillis() - startTime));
                    }
                }
            }
        }
    }
}
MRE schema:
CREATE TABLE A (AA BIGINT NOT NULL, CONSTRAINT PK_AA PRIMARY KEY (AA));
CREATE TABLE B (BA BIGINT NOT NULL, BB BIGINT NOT NULL, CONSTRAINT PK_BA_BB PRIMARY KEY (BA, BB));
ALTER TABLE B ADD CONSTRAINT FK_BA_A FOREIGN KEY (BA) REFERENCES A (AA);
ALTER TABLE B ADD CONSTRAINT FK_BB_A FOREIGN KEY (BB) REFERENCES A (AA);
MRE slow run example log:
2022-05-31 14:50:45 I main App INSERT A CUMULATIVE TIME: 108
2022-05-31 14:50:45 I main App INSERT B CUMULATIVE TIME: 4395
2022-05-31 14:50:45 I main App ROWS PER SEC: 3623
2022-05-31 14:50:58 I main App INSERT A CUMULATIVE TIME: 197
2022-05-31 14:50:58 I main App INSERT B CUMULATIVE TIME: 17046
2022-05-31 14:50:58 I main App ROWS PER SEC: 1896
MRE fast run example log:
2022-05-31 14:51:02 I main App INSERT A CUMULATIVE TIME: 100
2022-05-31 14:51:02 I main App INSERT B CUMULATIVE TIME: 269
2022-05-31 14:51:02 I main App ROWS PER SEC: 43115
2022-05-31 14:51:03 I main App INSERT A CUMULATIVE TIME: 177
2022-05-31 14:51:03 I main App INSERT B CUMULATIVE TIME: 509
2022-05-31 14:51:03 I main App ROWS PER SEC: 46811
Schema with which I do not observe the phenomenon:
CREATE TABLE A (AA BIGINT NOT NULL, CONSTRAINT PK_AA PRIMARY KEY (AA));
CREATE TABLE B (BA BIGINT NOT NULL, CONSTRAINT PK_BA PRIMARY KEY (BA));
ALTER TABLE B ADD CONSTRAINT FK_BA_A FOREIGN KEY (BA) REFERENCES A (AA);
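Your program never commits anything, so the tables start out empty every time you run it. Whether they contain no rows at all, or just no committed rows but plenty of in-progress/aborted ones, and whether the planner knows about it, depends on when VACUUM or ANALYZE was last run.
If table "a" starts out truly empty, the planner decides that the most efficient way to verify the constraint is a sequential scan on "a" rather than an index scan. That is fine at the start, but it gets slower and slower as the scan has to dig through more and more of the session's own newly inserted rows.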
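To make the "slower and slower" part concrete, here is a rough toy model in Java. It is not PostgreSQL code and does not measure real timings; it only counts how many rows would be touched if every foreign-key check scanned the parent table from the start, compared with a tree-style lookup. The class and method names are made up for this illustration, and the model assumes parent keys are inserted in increasing order, as in the MRE.

```java
// Toy cost model (hypothetical, not PostgreSQL internals): compares the work
// done by foreign-key checks using a sequential scan versus an index lookup.
final class FkCheckCostModel {

    // Rows examined when each of n child inserts scans the parent table from
    // the start until it finds its key. Parent keys are 0..n-1 in insertion
    // order, so finding key i touches i + 1 rows. Total grows as n^2 / 2.
    static long seqScanRowsExamined(int n) {
        long examined = 0;
        for (int i = 0; i < n; i++) {
            examined += i + 1L;
        }
        return examined;
    }

    // Approximate comparisons when each check is a balanced-tree lookup:
    // ceil(log2(size)) per lookup, with a minimum of 1. Total grows as n log n.
    static long indexLookupSteps(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            int size = i + 1;
            // 32 - numberOfLeadingZeros(size - 1) equals ceil(log2(size)) for size >= 1.
            int depth = 32 - Integer.numberOfLeadingZeros(size - 1);
            steps += Math.max(1, depth);
        }
        return steps;
    }

    public static void main(String[] args) {
        int rows = 16_384; // same row count as the loop in the MRE
        System.out.println("seq scan rows examined: " + seqScanRowsExamined(rows));
        System.out.println("index lookup steps:     " + indexLookupSteps(rows));
    }
}
```

In this model the sequential-scan cost per check keeps growing with the table, which matches the pattern in the slow log, where the cumulative time for B rises much faster than for A.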
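I can kick it into fast mode by manually running ANALYZE a,b in a separate session while the first session is still running (although whether this works may depend on the version; I think that in older versions, ANALYZE in one session ignored another session's uncommitted rows).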
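For anyone who wants to try the same thing from Java instead of psql, a minimal sketch might look like the following. It assumes the pgJDBC driver is on the classpath and that the JDBC URL, user and password are passed as program arguments; the class name and the analyzeSql helper are invented for this example. Without arguments it only prints the statement it would run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical helper: runs ANALYZE from a second connection while the
// loading session is still busy.
final class AnalyzeFromSecondSession {

    // Builds the statement text. Table names are treated as trusted
    // identifiers and are not quoted or validated here.
    static String analyzeSql(String... tables) {
        return "ANALYZE " + String.join(", ", tables);
    }

    public static void main(String[] args) throws Exception {
        String sql = analyzeSql("a", "b");
        if (args.length < 3) {
            // No connection details supplied: only show what would be run.
            System.out.println(sql);
            return;
        }
        // args: JDBC URL, user, password (all supplied by the caller).
        try (Connection second = DriverManager.getConnection(args[0], args[1], args[2]);
             Statement st = second.createStatement()) {
            // A new JDBC connection is in auto-commit mode by default, so the
            // ANALYZE runs as its own transaction, separate from the loader's.
            st.execute(sql);
        }
    }
}
```

As noted above, whether this actually switches the running session to the fast plan may depend on the server version.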