EclipseLink and PostgreSQL batch writing

I have been developing a BulkSMS solution for one of my clients, and I decided to use JPA (EclipseLink) as the ORM, with PostgreSQL 9.5.1 as the underlying database.

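My problem is that whenever I send a request containing 65,000 records to be persisted, the operation takes around 27 seconds to complete. I decided to implement sequence pooling, sequence preallocation = 1000, and batch writing, but that only managed to cut 15 seconds from the operation.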

After investigating the database logs, I noticed that the same queries are issued before and after applying the optimizations.

Here is my optimized persistence.xml:

<persistence version="2.1" xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
<persistence-unit name="com.kw.ktt.sms.server" transaction-type="JTA">
    <jta-data-source>SMSDB</jta-data-source>
     <non-jta-data-source>sequence</non-jta-data-source> 
    <class>com.kw.ktt.sms.server.core.TestClass</class>
    <class>com.kw.ktt.sms.server.jpa.Customer</class>
    <class>com.kw.ktt.sms.server.jpa.SMSAccount</class>
    <class>com.kw.ktt.sms.server.jpa.SMSTransaction</class>
    <class>com.kw.ktt.sms.server.jpa.ContactGroup</class>
    <class>com.kw.ktt.sms.server.jpa.PhoneNumber</class>
    <properties>
        <property name="eclipselink.application-location" value="/Users/mousaalsulaimi/Desktop"/>
        <property name="eclipselink.ddl-generation.output-mode" value="database"/>
        <property name="eclipselink.logging.connection" value="true"/>
        <property name="javax.persistence.schema-generation.database.action" value="drop-and-create"/>
        <property name="eclipselink.ddl-generation" value="drop-and-create-tables"/>
        <property name="eclipselink.jdbc.batch-writing" value="JDBC" /> 
        <property name="eclipselink.jdbc.batch-writing.size" value="1000"/> 
        <property name="eclipselink.jdbc.sequence-connection-pool"  value="true" />
        <property name="eclipselink.connection-pool.sequence.nonJtaDataSource" value="sequence"/>
        <property name="eclipselink.connection-pool.sequence.intial" value="1000" /> 
    </properties>
</persistence-unit>
</persistence>

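As shown above, I use a JTA connection pool for persisting (called SMSDB) and a non-JTA connection for sequencing (called sequence). Each connection pool has a different database user, so that its connections are easy to trace in the database logs.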

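The logs for the non-optimized connection are here - this is just a sample for 10 records.

The logs for the optimized connection are here - this is just a sample for 10 records.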

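Can someone please explain what I am doing wrong, and why both persistence setups produce the same queries even though there is an actual improvement of 15 seconds?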

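One more thing: I have set sequence preallocation to 1000 in the entity source code, and judging by the database logs, sequencing works as expected and fetches the correct increment values. My concern is batch writing; I am afraid it is not set up properly in persistence.xml.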

Update

I have enabled logging in EclipseLink as Chris suggested. Here is the EclipseLink log generated when using the optimized persistence.xml:
2016-02-27T23:59:28.307+0300|Fine: SELECT CUSTOMERID, CIVILIDNUMBER, CREATEDATE, CREATEDBY, EMAIL, FULLNAME, ISACTIVE, ISADMIN, MALE, PASSWORD, PERSONAL, PHONENUMBER, STATUS, USERNAME, ACCOUNT_SMSACCOUNTID FROM CUSTOMER WHERE (CUSTOMERID = ?)
    bind => [1 parameter bound]
2016-02-27T23:59:28.310+0300|Fine: SELECT SMSACCOUNTID, OOOREDOOO_BALANCE, VIVA_BALANCE, ZAIN_BALANCE FROM SMSACCOUNT WHERE (SMSACCOUNTID = ?)
    bind => [1 parameter bound]
2016-02-27T23:59:28.312+0300|Fine: select nextval('SEQ_GEN_SEQUENCE')
2016-02-27T23:59:28.327+0300|Fine: select nextval('number_seq')
2016-02-27T23:59:28.331+0300|Info: this is the id 1
2016-02-27T23:59:28.332+0300|Fine: INSERT INTO SMSACCOUNT (SMSACCOUNTID, OOOREDOOO_BALANCE, VIVA_BALANCE, ZAIN_BALANCE) VALUES (?, ?, ?, ?)
    bind => [4 parameters bound]
2016-02-27T23:59:28.335+0300|Fine: INSERT INTO CUSTOMER (CUSTOMERID, CIVILIDNUMBER, CREATEDATE, CREATEDBY, EMAIL, FULLNAME, ISACTIVE, ISADMIN, MALE, PASSWORD, PERSONAL, PHONENUMBER, STATUS, USERNAME, ACCOUNT_SMSACCOUNTID) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    bind => [15 parameters bound]
2016-02-27T23:59:28.337+0300|Fine: INSERT INTO CONTACTGROUP (GROUPID, CREATEBY, CREATEDATE, GROUPDESCRIPTION, GROUPNAME) VALUES (?, ?, ?, ?, ?)
    bind => [5 parameters bound]
2016-02-27T23:59:28.339+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
2016-02-27T23:59:28.339+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.339+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.339+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.339+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.339+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.339+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.339+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.340+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.340+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.340+0300|Fine: bind => [3 parameters bound]
2016-02-27T23:59:28.342+0300|Fine: UPDATE CONTACTGROUP SET customerID = ? WHERE (GROUPID = ?)
    bind => [2 parameters bound]
2016-02-27T23:59:28.343+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]
2016-02-27T23:59:28.344+0300|Fine: bind => [2 parameters bound]

And below is the EclipseLink log generated when using the original persistence.xml:
2016-02-28T08:56:25.440+0300|Fine: SELECT CUSTOMERID, CIVILIDNUMBER, CREATEDATE, CREATEDBY, EMAIL, FULLNAME, ISACTIVE, ISADMIN, MALE, PASSWORD, PERSONAL, PHONENUMBER, STATUS, USERNAME, ACCOUNT_SMSACCOUNTID FROM CUSTOMER WHERE (CUSTOMERID = ?)
    bind => [1 parameter bound]
2016-02-28T08:56:25.443+0300|Fine: SELECT SMSACCOUNTID, OOOREDOOO_BALANCE, VIVA_BALANCE, ZAIN_BALANCE FROM SMSACCOUNT WHERE (SMSACCOUNTID = ?)
    bind => [1 parameter bound]
2016-02-28T08:56:25.445+0300|Fine: select nextval('SEQ_GEN_SEQUENCE')
2016-02-28T08:56:25.447+0300|Fine: select nextval('number_seq')
2016-02-28T08:56:25.449+0300|Info: this is the id 1
2016-02-28T08:56:25.450+0300|Fine: INSERT INTO SMSACCOUNT (SMSACCOUNTID, OOOREDOOO_BALANCE, VIVA_BALANCE, ZAIN_BALANCE) VALUES (?, ?, ?, ?)
    bind => [4 parameters bound]
2016-02-28T08:56:25.451+0300|Fine: INSERT INTO CUSTOMER (CUSTOMERID, CIVILIDNUMBER, CREATEDATE, CREATEDBY, EMAIL, FULLNAME, ISACTIVE, ISADMIN, MALE, PASSWORD, PERSONAL, PHONENUMBER, STATUS, USERNAME, ACCOUNT_SMSACCOUNTID) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    bind => [15 parameters bound]
2016-02-28T08:56:25.452+0300|Fine: INSERT INTO CONTACTGROUP (GROUPID, CREATEBY, CREATEDATE, GROUPDESCRIPTION, GROUPNAME) VALUES (?, ?, ?, ?, ?)
    bind => [5 parameters bound]
2016-02-28T08:56:25.452+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.453+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.453+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.454+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.454+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.454+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.455+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.455+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.455+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.456+0300|Fine: INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)
    bind => [3 parameters bound]
2016-02-28T08:56:25.456+0300|Fine: UPDATE CONTACTGROUP SET customerID = ? WHERE (GROUPID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.457+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.457+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.458+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.458+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.459+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.459+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.460+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.460+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.460+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]
2016-02-28T08:56:25.461+0300|Fine: UPDATE PHONENUMBER SET groupId = ? WHERE (NUMBERID = ?)
    bind => [2 parameters bound]

Clearly, there is a big difference between the queries produced with the optimized persistence.xml and those produced with the original one.

Turn on EclipseLink's SQL logging and you should see differences in how the statements are prepared and handled in JDBC, which should account for the 15-second difference.
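To make that difference concrete, here is a minimal plain-JDBC sketch of what batch writing amounts to: one prepared INSERT, many bound rows queued with addBatch, and one executeBatch per group. This is not EclipseLink's internal code. The class name, the method name, the row layout, and the column types for PHONENUMBER are assumptions made for illustration; only the SQL text comes from the log.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchSketch {

    // SQL text as it appears in the EclipseLink log; the column types are assumed.
    static final String INSERT_SQL =
        "INSERT INTO PHONENUMBER (NUMBERID, OPERATOR, PHONENUMBER) VALUES (?, ?, ?)";

    /**
     * Binds every row to the same prepared statement and sends the rows
     * to the database in groups of batchSize.
     * Each row is assumed to be {numberId, operator, phoneNumber}.
     * Returns how many times executeBatch was called.
     */
    static int insertInBatches(PreparedStatement ps, List<String[]> rows, int batchSize)
            throws SQLException {
        int flushes = 0;
        int pending = 0;
        for (String[] row : rows) {
            ps.setLong(1, Long.parseLong(row[0]));
            ps.setString(2, row[1]);
            ps.setString(3, row[2]);
            ps.addBatch();              // queue the bound row, nothing is sent yet
            pending++;
            if (pending == batchSize) {
                ps.executeBatch();      // send the whole group at once
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {              // send the final, partially filled group
            ps.executeBatch();
            flushes++;
        }
        return flushes;
    }
}
```

A caller would obtain the statement with `connection.prepareStatement(JdbcBatchSketch.INSERT_SQL)`. With a batch size of 1000, 65,000 rows are sent in 65 executeBatch calls instead of 65,000 separate executes, which is the pattern the optimized log shows as a single INSERT followed by repeated bind lines.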

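I'm not familiar with the eclipselink.connection-pool.sequence.intial property - what you should be using is the allocationSize setting on the sequence generator itself, which allows 1000 sequence values to be fetched at a time.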

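Without that set, batch writing will reduce the number of insert statements, but you will still see numerous statements fetching sequence numbers, only on a different connection, since sequencing uses its own connection pool.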
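The effect of allocationSize on the number of sequence round trips can be illustrated with a toy model. The sketch below is plain Java and does not use EclipseLink or a database: FakeDbSequence, PreallocatingIdGenerator, and roundTripsFor are hypothetical names, and the model assumes each simulated nextval call reserves a whole block and returns its last value. In JPA the corresponding setting is the allocationSize attribute of @SequenceGenerator, and the database sequence's increment is expected to match it.

```java
class SequencePreallocationDemo {

    /** Stands in for the database sequence and counts simulated round trips. */
    static final class FakeDbSequence {
        private final long increment;
        private long current = 0;
        long roundTrips = 0;

        FakeDbSequence(long increment) {
            this.increment = increment;
        }

        /** Simulates "select nextval(...)": reserves a block, returns its last value. */
        long nextval() {
            roundTrips++;
            current += increment;
            return current;
        }
    }

    /** Hands out ids from a locally cached block and refills it when exhausted. */
    static final class PreallocatingIdGenerator {
        private final FakeDbSequence sequence;
        private final long allocationSize;
        private long next = 1;
        private long last = 0;

        PreallocatingIdGenerator(FakeDbSequence sequence, long allocationSize) {
            this.sequence = sequence;
            this.allocationSize = allocationSize;
        }

        long nextId() {
            if (next > last) {                      // local block is used up
                last = sequence.nextval();          // one round trip per block
                next = last - allocationSize + 1;   // first id of the new block
            }
            return next++;
        }
    }

    /** Returns how many simulated nextval calls are needed for the given record count. */
    static long roundTripsFor(int records, int allocationSize) {
        // In JPA terms (assumed mapping, not taken from the question):
        // @SequenceGenerator(name = "...", sequenceName = "number_seq", allocationSize = 1000)
        FakeDbSequence sequence = new FakeDbSequence(allocationSize);
        PreallocatingIdGenerator generator = new PreallocatingIdGenerator(sequence, allocationSize);
        for (int i = 0; i < records; i++) {
            generator.nextId();
        }
        return sequence.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println(roundTripsFor(65_000, 1));     // 65000
        System.out.println(roundTripsFor(65_000, 1000));  // 65
    }
}
```

Under these assumptions, 65,000 records need 65,000 sequence calls with an allocation size of 1 but only 65 with an allocation size of 1000, which is why the sequence statements should nearly disappear from the log once preallocation is actually in effect.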