C#, Entity Framework Core & PostgreSQL: inserting a single row takes 20+ seconds

I am using Entity Framework Core with the NuGet package Npgsql.EntityFrameworkCore.PostgreSQL.

I have read all the other answers about slow inserts in Entity Framework Core, but none of them helped.

using (var db = getNewContext())
{
     db.Table1.Add(Table1Object);
     db.SaveChanges();
}

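This single insert takes about 20 to 30 seconds. The table has fewer than 100 rows. I put the stopwatch start and stop inside the using block to make sure the time is not caused by context initialization.

Here is the class for my table object (the relevant property names have been changed):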
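A minimal sketch of that kind of measurement, assuming a small helper named Timing.Measure (a hypothetical name, not part of the original code). It only wraps a Stopwatch around whatever action is passed in, so the context creation can be kept outside the timed region:

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs the action once and returns the wall-clock time it took.
    public static TimeSpan Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Inside the using block this could be called as `var elapsed = Timing.Measure(() => db.SaveChanges());`, which times only the save and not the creation of the context.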

public partial class Table1Object
{
    public long Id { get; set; }
    public Guid SessionId { get; set; }
    public DateTime Timestamp { get; set; }
    public long MyNumber1 { get; set; }
    public double MyNumber2 { get; set; }
    public double MyNumber3 { get; set; }
    public double MyNumber4 { get; set; }
    public long? ParentId { get; set; }
    public bool MyBool { get; set; }
}

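SessionId is used to link to another table (the Session table), but I have not explicitly defined a foreign key or any other constraint for this anywhere. ParentId is also used to link back to another row in the same table, but I have not explicitly defined a constraint for this either.

Running the equivalent code on a different table takes less than a second to insert a row. Table2 has fewer columns, but I would not have expected the row size to differ enough to have such a drastic effect: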

public partial class Table2Object
{
    public int Id { get; set; }
    public DateTime Timestamp { get; set; }
    public string Name { get; set; }
    public double Value { get; set; }
}

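Using Serilog and Entity Framework Core logging, you can see that the delay is in the "Committing transaction" step, which takes about 26 seconds, while the insert itself takes only 6 ms (some log statements have been trimmed for brevity):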

2021-04-08 11:20:36.874 [DBG] 'DataContext' generated a temporary value for the property 'Id.Table1'.
2021-04-08 11:20:36.879 [DBG] Context 'DataContext' started tracking 'Table1' entity.
2021-04-08 11:20:36.880 [DBG] SaveChanges starting for 'DataContext'.
2021-04-08 11:20:36.881 [DBG] DetectChanges starting for 'DataContext'.
2021-04-08 11:20:36.905 [DBG] DetectChanges completed for 'DataContext'.
2021-04-08 11:20:36.906 [DBG] Opening connection to database
2021-04-08 11:20:36.907 [DBG] Opened connection to database
2021-04-08 11:20:36.908 [DBG] Beginning transaction with isolation level 'Unspecified'.
2021-04-08 11:20:36.909 [DBG] Began transaction with isolation level 'ReadCommitted'.
2021-04-08 11:20:36.912 [DBG] Creating DbCommand for 'ExecuteReader'.
2021-04-08 11:20:36.913 [DBG] Created DbCommand for 'ExecuteReader' (0ms).
2021-04-08 11:20:36.914 [DBG] Executing DbCommand [Parameters= ...]
INSERT INTO "Table1" ("SessionId", "Timestamp" ...)
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7)
RETURNING "Id";
2021-04-08 11:20:36.920 [INF] Executed DbCommand (6ms) Parameters=[...]
INSERT INTO "Table1" ("SessionId", "Timestamp" ...)
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7)
RETURNING "Id";
2021-04-08 11:20:36.925 [DBG] The foreign key property 'Table1.Id' was detected as changed.
2021-04-08 11:20:36.930 [DBG] A data reader was disposed.
2021-04-08 11:20:36.931 [DBG] Committing transaction.
2021-04-08 11:21:02.729 [DBG] Committed transaction.
2021-04-08 11:21:02.730 [DBG] Closing connection to database

Here is the equivalent log when inserting into Table2. The insert takes 3 ms and the commit takes 75 ms. This is how fast it should be:

2021-04-08 11:20:36.459 [DBG] 'DataContext' generated a temporary value for the property 'Id.Table2'.
2021-04-08 11:20:36.460 [DBG] Context 'DataContext' started tracking 'Table2' entity.
2021-04-08 11:20:36.461 [DBG] SaveChanges starting for 'DataContext'.
2021-04-08 11:20:36.462 [DBG] DetectChanges starting for 'DataContext'.
2021-04-08 11:20:36.463 [DBG] DetectChanges completed for 'DataContext'.
2021-04-08 11:20:36.464 [DBG] Opening connection to database
2021-04-08 11:20:36.465 [DBG] Opened connection to database
2021-04-08 11:20:36.466 [DBG] Beginning transaction with isolation level 'Unspecified'.
2021-04-08 11:20:36.467 [DBG] Began transaction with isolation level 'ReadCommitted'.
2021-04-08 11:20:36.468 [DBG] Creating DbCommand for 'ExecuteReader'.
2021-04-08 11:20:36.469 [DBG] Created DbCommand for 'ExecuteReader' (0ms).
2021-04-08 11:20:36.470 [DBG] Executing DbCommand [Parameters=...]
INSERT INTO "Table2" ("Name", "Timestamp", "Value")
VALUES (@p0, @p1, @p2)
RETURNING "Id";
2021-04-08 11:20:36.472 [INF] Executed DbCommand (3ms) [Parameters=[...]
INSERT INTO "Table2" ("Name", "Timestamp", "Value")
VALUES (@p0, @p1, @p2)
RETURNING "Id";
2021-04-08 11:20:36.474 [DBG] The foreign key property 'Table2.Id' was detected as changed.
2021-04-08 11:20:36.475 [DBG] A data reader was disposed.
2021-04-08 11:20:36.476 [DBG] Committing transaction.
2021-04-08 11:20:36.551 [DBG] Committed transaction.
2021-04-08 11:20:36.552 [DBG] Closing connection to database

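I do not know what is different about the table other than the slightly larger row size. I dropped and recreated the table in case there were any constraints, foreign keys, triggers, etc. that I was not aware of.

The "explain" plan for the insert produces: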

"Insert on ""Table1""  (cost=0.00..0.01 rows=1 width=81)"
"  ->  Result  (cost=0.00..0.01 rows=1 width=81)"

Enabling "show query log" for PostgreSQL gives about the same amount of information as the Entity Framework logging:

2021-04-09 12:05:06.559 BST [1979] user1@database LOG:  statement: BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED
2021-04-09 12:05:06.560 BST [1979] user1@database LOG:  execute <unnamed>: INSERT INTO "Table1" (...)
    VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
    RETURNING "Id"
2021-04-09 12:05:06.560 BST [1979] user1@database DETAIL:  parameters: $1 = '0.580484961751977', $2 = 'f', $3 = '0.205387434417341', $4 = '18', $5 = '148', $6 = '93c71fb5-836a-486a-8d82-e073743b41cd', $7 = '2021-04-09 11:04:58.123773', $8 = '1.15474773024298'
2021-04-09 12:05:06.565 BST [1979] user1@database LOG:  statement: COMMIT
2021-04-09 12:05:47.352 BST [1443] postgres@database LOG:  statement: /*pga4dash*/
    SELECT 'session_stats' AS chart_name, row_to_json(t) AS chart_data
    FROM ...
    UNION ALL
    SELECT 'tps_stats' AS chart_name, row_to_json(t) AS chart_data
    FROM ...
    UNION ALL
    SELECT 'ti_stats' AS chart_name, row_to_json(t) AS chart_data
    FROM ...
    UNION ALL
    SELECT 'to_stats' AS chart_name, row_to_json(t) AS chart_data
    FROM ...
    UNION ALL
    SELECT 'bio_stats' AS chart_name, row_to_json(t) AS chart_data
    FROM ...
    
2021-04-09 12:05:51.148 BST [1979] user1@database LOG:  statement: DISCARD ALL

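You can see that after the COMMIT statement, about 41 seconds pass before the next statement, which is some internal dashboard query. 41 seconds just to commit a single-row insert!

Compare this with the insert into Table2, where the commit takes only about 100 ms!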

2021-04-09 12:05:06.097 BST [1979] user1@database LOG:  statement: BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED
2021-04-09 12:05:06.097 BST [1979] user1@database LOG:  execute <unnamed>: INSERT INTO "Table2" ("Name", "Timestamp", "Value")
    VALUES ($1, $2, $3)
    RETURNING "Id"
2021-04-09 12:05:06.097 BST [1979] user1@database DETAIL:  parameters: $1 = 'Test', $2 = '2021-04-09 11:05:06.096182', $3 = '98'
2021-04-09 12:05:06.098 BST [1979] user1@database LOG:  statement: COMMIT
2021-04-09 12:05:06.189 BST [1979] user1@database LOG:  statement: DISCARD ALL

I ran the following statement directly in pgAdmin, and it reported that it took 323 ms:

BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO "Table1" ("MyColumn1", "MyColumn2", "MyColumn3", "MyColumn4", "ParentId", "SessionId", "Timestamp", "MyColumn5")
    VALUES ('0.580484961751977','f' , '0.205387434417341','18',  '148',  '93c71fb5-836a-486a-8d82-e073743b41cd','2021-04-09 11:04:58.123773',  '1.15474773024298')
    RETURNING "Id";
COMMIT;

I also tried running the statement directly through Npgsql with the following C# code:

            _logger.Debug("Using connection");
            using (var conn = new NpgsqlConnection(StaticConfig.ConnectionString))
            {
                _logger.Debug("connection.open");
                conn.Open();
                _logger.Debug("Using command");
                // Insert some data
                using (var cmd = new NpgsqlCommand(
                    " BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;" +
                    " INSERT INTO \"Table1\" (\"MyColumn1\", \"MyColumn2\", \"MyColumn3\", \"MyColumn4\", \"ParentId\", \"SessionId\", \"Timestamp\", \"MyColumn5\")" +
                    " VALUES ('0.580484961751977','f' , '0.205387434417341','18',  '148',  '93c71fb5-836a-486a-8d82-e073743b41cd','2021-04-09 11:04:58.123773',  '1.15474773024298')" +
                    " RETURNING \"Id\";" +
                    "COMMIT;"
                    , conn))
                {
                    _logger.Debug("command execute");
                    cmd.ExecuteNonQuery();
                }
            }
            _logger.Debug("Done");

The log statements in that code tell me the whole thing took less than a second:

[21:59:41 DBG] Using connection
[21:59:41 DBG] connection.open
[21:59:42 DBG] Using command
[21:59:42 DBG] command execute
[21:59:42 DBG] Done

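I also dropped the database, deleted all the migrations in Entity Framework, and created a new initial-create migration, so everything is running from scratch. It still takes about 20 seconds to insert into Table1, but less than a second to insert into Table2.

Putting Enlist=false in the connection string did not help.

I agree with @Mark G's comment that "the findings ... suggest the problem is either upstream of EF Core or in the provider", but I am not sure how to diagnose the problem any further.

I have changed the code to use Npgsql to insert rows into this table via raw SQL, and that is very fast: less than 100 ms per insert. So the most likely candidate seems to be a bug in Entity Framework Core, but since I do not know what the specific problem is, it is hard to raise a bug report with their team.

Have you tried injecting the context instead of creating a new instance with getNewContext every time you save data?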

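// DataContext is the context type shown in the question's logs; it is injected once.
private readonly DataContext _context;

public RaceService(DataContext context)
{
    _context = context;
}

public Task<ReturnObject> MethodName()
{
    _context.Table1.Add(Table1Object);
    _context.SaveChanges();
    // ReturnObject is a placeholder type; a parameterless constructor is assumed here.
    return Task.FromResult(new ReturnObject());
}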

The main difference I can see between Table1Object and Table2Object is the presence of the XxxId properties.

SessionId is used to link to another table (Session table), but I have not explicitly defined a foreign key or any other constraints for this anywhere. ParentId is also used to link back to another row in the same table, but I have not explicitly defined a constraint for this.

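EF Core recognizes this pattern, and depending on your other tables it may create relationships by convention, for example if another table/entity has a navigation property to your Table1Object: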

https://docs.microsoft.com/en-us/ef/core/modeling/relationships?tabs=fluent-api%2Cfluent-api-simple-key%2Csimple-key#conventions

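With that in mind, my educated guess is that Serilog and EF Core together are running into some problem. I am sure it is temporary or already fixed, but I would try removing Serilog from the equation: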

https://github.com/serilog/serilog-sinks-seq/issues/98

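After a lot of testing I eventually found that the problem was not in Entity Framework or Npgsql at all; the delay I was seeing was caused by write caching. Before inserting a row into Table1 I always write a 30 MB file, and I believed the file write was complete once File.WriteAllBytes returned, so it would not affect any later timing. However, at the OS level the data had not actually finished being written to disk when the insert statement ran, which artificially delayed the insert.

I proved this with the following code: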

Stopwatch sw1 = new Stopwatch();
sw1.Start();
File.WriteAllBytes(myFilePath, myBytes); // myFilePath: placeholder for the target path, which the original snippet omitted
sw1.Stop();

Thread.Sleep(1000);

Stopwatch sw2 = new Stopwatch();
sw2.Start();
MethodThatInsertsIntoTable1();
sw2.Stop();

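Stopwatch 1 shows that File.WriteAllBytes always takes about 500 ms, and stopwatch 2 then measures about 20 to 30 seconds.

If I change MethodThatInsertsIntoTable1 to insert into a different table, it still takes 20 to 30 seconds regardless of the table.

If I increase Thread.Sleep(1000) to Thread.Sleep(30000), stopwatch 2 records an insert time of less than 10 ms.

This shows that even though File.WriteAllBytes has returned control to the program, the file has not actually finished being written to disk.

The environment I am running in is Linux on a Raspberry Pi. A write speed test confirmed that my write speed to the SD card is just over 1 MB/s, which is consistent with the results I was seeing: writing a 30 MB file takes 20 to 30 seconds and could not possibly complete in the 500 ms that stopwatch 1 reported.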
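One way to make the first stopwatch reflect the real cost of the write, sketched under assumptions: FileStream.Flush(true) asks the operating system to flush its buffers for the file to the underlying device (on Linux this is expected to map to an fsync call), so the call should not return until the cached data has been pushed out. DurableWrite and WriteAndFlush are hypothetical names, not part of the original code.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class DurableWrite
{
    // Writes the bytes, then requests a flush to the device before returning,
    // so the measured time includes the physical write and not only the copy
    // into the OS write cache.
    public static TimeSpan WriteAndFlush(string path, byte[] data)
    {
        var sw = Stopwatch.StartNew();
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            fs.Write(data, 0, data.Length);
            fs.Flush(flushToDisk: true);
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

If the write-cache explanation is right, timing this method on the SD card should report something close to the full 20 to 30 seconds, and an insert timed immediately afterwards should no longer be delayed.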
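The arithmetic behind that estimate can be sketched as follows. The 1.0 MB/s figure comes from the speed test above; the 1.5 MB/s upper bound is only an assumption used to illustrate "just over 1 MB/s", and WriteEstimate is a hypothetical helper name.

```csharp
using System;

public static class WriteEstimate
{
    // Rough time in seconds to push sizeMb megabytes through a device
    // that sustains mbPerSecond megabytes per second.
    public static double Seconds(double sizeMb, double mbPerSecond)
    {
        if (mbPerSecond <= 0)
            throw new ArgumentOutOfRangeException(nameof(mbPerSecond));
        return sizeMb / mbPerSecond;
    }
}
```

With these numbers, `WriteEstimate.Seconds(30, 1.0)` gives 30 seconds and `WriteEstimate.Seconds(30, 1.5)` gives 20 seconds, which brackets the 20 to 30 second delay and is far from the 500 ms reported by stopwatch 1.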

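Another user, in the question "File.WriteAllBytes does not block", appears to have run into a problem for the same reason.

After adding an external USB SSD to the Raspberry Pi and changing the code to save the files there, saving the file takes only 0.5 seconds and the problem has gone away.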