DataAdapter.Update() performance

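I have a relatively simple routine that looks at database entries for media files, calculates the width, height and file size, and writes them back into the database.

The database is SQLite, accessed through the System.Data.SQLite library, and the routine processes ~4000 rows. I load all rows into an ADO table, update the rows/columns with the new values, then run adapter.Update(table); on it.

Loading the dataset from the db tables took half a second or so; updating all the rows with image width/height and getting the file length from FileInfo took maybe 30 seconds. Fine.

The adapter.Update(table); command took somewhere in the vicinity of 5 to 7 minutes to run.

That seems awfully excessive. The ID is a PK INTEGER and thus, according to SQLite's docs, inherently indexed, yet even so I can't help but think that if I were to run a separate update command for each individual update, this would have completed much faster.

I had considered ADO/adapters to be relatively low level (as opposed to ORMs, anyway), so this terrible performance surprised me. Can anyone shed some light on why it would take 5-7 minutes to update a batch of ~4000 records against a locally placed SQLite database?

As a possible aside, is there some way to "peek into" how ADO is processing this? Internal library step-throughs or...?

Thanks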

public static int FillMediaSizes() {
        // returns the count of records updated

        int recordsAffected = 0;

        DataTable table = new DataTable();
        SQLiteDataAdapter adapter = new SQLiteDataAdapter();

        using (SQLiteConnection conn = new SQLiteConnection(Globals.Config.dbAppNameConnectionString))
        using (SQLiteCommand cmdSelect = new SQLiteCommand())
        using (SQLiteCommand cmdUpdate = new SQLiteCommand()) {

            cmdSelect.Connection = conn;
            cmdSelect.CommandText =
                "SELECT ID, MediaPathCurrent, MediaWidth, MediaHeight, MediaFilesizeBytes " +
                "FROM Media " +
                "WHERE MediaType = 1 AND (MediaWidth IS NULL OR MediaHeight IS NULL OR MediaFilesizeBytes IS NULL);";

            cmdUpdate.Connection = conn;
            cmdUpdate.CommandText =
                "UPDATE Media SET MediaWidth = @w, MediaHeight = @h, MediaFilesizeBytes = @b WHERE ID = @id;";

            cmdUpdate.Parameters.Add("@w", DbType.Int32, 4, "MediaWidth");
            cmdUpdate.Parameters.Add("@h", DbType.Int32, 4, "MediaHeight");
            cmdUpdate.Parameters.Add("@b", DbType.Int32, 4, "MediaFilesizeBytes");
            SQLiteParameter param = cmdUpdate.Parameters.Add("@id", DbType.Int32);
            param.SourceColumn = "ID";
            param.SourceVersion = DataRowVersion.Original;

            adapter.SelectCommand = cmdSelect;
            adapter.UpdateCommand = cmdUpdate;

            try {
                conn.Open();
                adapter.Fill(table);
                conn.Close();
            }
            catch (Exception e) {
                Core.ExceptionHandler.HandleException(e, true);
                throw new DatabaseOperationException("", e);
            }

            foreach (DataRow row in table.Rows) {

                try {

                    using (System.Drawing.Image img = System.Drawing.Image.FromFile(row["MediaPathCurrent"].ToString())) {

                        System.IO.FileInfo fi;

                        fi = new System.IO.FileInfo(row["MediaPathCurrent"].ToString());

                        if (img != null) {

                            int width = img.Width;
                            int height = img.Height;
                            long length = fi.Length;

                            row["MediaWidth"] = width;
                            row["MediaHeight"] = height;
                            row["MediaFilesizeBytes"] = (int)length;
                        }
                    }
                }
                catch (Exception e) {
                    Core.ExceptionHandler.HandleException(e);
                    DevUtil.Print(e);
                    continue;
                }
            }                


            try {
                recordsAffected = adapter.Update(table);
            }
            catch (Exception e) {
                Core.ExceptionHandler.HandleException(e);
                throw new DatabaseOperationException("", e);
            }


        }

        return recordsAffected;
    }

Loading the dataset from the db tables half a second or so

This is a single SQL statement (so it's fast). Execute the SQL SELECT, populate the dataset, done.

updating all the rows with image width/height and getting the file length from FileInfo took maybe 30 seconds. Fine.

This is updating the in-memory data (so that's fast too): it changes x rows in the dataset and doesn't talk to SQL at all.

The adapter.Update(table); command took somewhere in the vicinity of 5 to 7 minutes to run.

This runs a SQL UPDATE for every updated row, which is why it's slow.

yet even so I can't help but think that if I were to run a separate update command for each individual update, this would have completed much faster.

That is basically what it's doing!


From MSDN:

The update is performed on a by-row basis. For every inserted, modified, and deleted row, the Update method determines the type of change that has been performed on it (Insert, Update or Delete). Depending on the type of change, the Insert, Update, or Delete command template executes to propagate the modified row to the data source. When an application calls the Update method, the DataAdapter examines the RowState property, and executes the required INSERT, UPDATE, or DELETE statements iteratively for each row, based on the order of the indexes configured in the DataSet.
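
To make the MSDN description concrete, here is a rough sketch of what Update() amounts to for modified rows. It is not the framework's actual implementation: UpdateRowByRow is a hypothetical name, and the command is assumed to be set up as in the question, on a connection that is already open. The point worth noticing is that, with no explicit transaction, SQLite runs each statement in its own implicit transaction and commits it before the next one starts, so ~4000 rows mean ~4000 separate commits. That is likely where most of the 5 to 7 minutes go.

```csharp
using System.Data;
using System.Data.SQLite;

static class UpdateSketch
{
    // Hypothetical helper: approximately what adapter.Update(table) does
    // for modified rows. Assumes cmdUpdate.Connection is already open and
    // every parameter has a SourceColumn, as in the question's code.
    public static int UpdateRowByRow(DataTable table, SQLiteCommand cmdUpdate)
    {
        int affected = 0;
        foreach (DataRow row in table.Rows)
        {
            // Unchanged rows are skipped; only modified rows produce an UPDATE.
            if (row.RowState != DataRowState.Modified) continue;

            // Copy the row's values into the parameters via the
            // SourceColumn / SourceVersion mapping.
            foreach (SQLiteParameter p in cmdUpdate.Parameters)
                p.Value = row[p.SourceColumn, p.SourceVersion];

            // One statement per row. Without an explicit transaction,
            // SQLite commits after each of these.
            affected += cmdUpdate.ExecuteNonQuery();
            row.AcceptChanges();
        }
        return affected;
    }
}
```

So hand-writing "a separate update command for each individual update" would not be faster on its own; the per-row commits are the cost, not the adapter.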


is there some way to "peek into" how ADO is processing this?

Yes: see "Debug .NET Framework Source Code in Visual Studio 2012?"
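
A lighter-weight way to watch what the adapter does, without stepping through framework source, is to hook the adapter's row events. The sketch below assumes that SQLiteDataAdapter exposes a RowUpdating event like the other ADO.NET data adapters do (check this against your version of System.Data.SQLite); UpdateTracing and Attach are hypothetical names.

```csharp
using System;
using System.Data.SQLite;

static class UpdateTracing
{
    // Hypothetical helper: logs each statement the adapter is about to
    // execute during Update(), with a timestamp and a running count.
    public static void Attach(SQLiteDataAdapter adapter)
    {
        int count = 0;
        adapter.RowUpdating += (sender, e) =>
        {
            count++;
            Console.WriteLine("{0} #{1} {2}: {3}",
                DateTime.Now.ToString("HH:mm:ss.fff"),
                count,
                e.StatementType,        // Insert, Update or Delete
                e.Command.CommandText); // the SQL about to run for this row
        };
    }
}
```

Calling UpdateTracing.Attach(adapter) before adapter.Update(table) should print one line per modified row, which makes the row-by-row behaviour and the time spent on each statement visible.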

Use Connection.BeginTransaction() to speed up the DataAdapter update.

conn.Open() ' open the connection
Dim myTrans As SQLiteTransaction = conn.BeginTransaction()
' Associate the transaction with the command that Update() actually executes
objDA.UpdateCommand.Transaction = myTrans

Try
    ' All per-row UPDATEs now run inside a single transaction
    objDA.Update(objDT)
    myTrans.Commit()
Catch ex As Exception
    myTrans.Rollback()
    Throw ' do not silently swallow the failure
Finally
    conn.Close()
End Try

This speeds up the update considerably.
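
For the C# code in the question, the same idea might look like the sketch below. TransactionalUpdate and UpdateInTransaction are hypothetical names, and the adapter is assumed to have its UpdateCommand configured as in the question. Note that the question's code closes the connection after Fill() and lets Update() reopen it; with an explicit transaction the connection has to be opened first and stay open for the whole batch.

```csharp
using System.Data;
using System.Data.SQLite;

static class TransactionalUpdate
{
    // Hypothetical helper: runs adapter.Update(table) inside one transaction
    // so that all per-row UPDATEs are committed together.
    public static int UpdateInTransaction(
        SQLiteConnection conn, SQLiteDataAdapter adapter, DataTable table)
    {
        // Open the connection only if the caller has not already done so.
        bool openedHere = conn.State != ConnectionState.Open;
        if (openedHere) conn.Open();
        try
        {
            using (SQLiteTransaction tx = conn.BeginTransaction())
            {
                adapter.UpdateCommand.Transaction = tx;
                int affected = adapter.Update(table);
                // Single commit for the whole batch. If Update() throws,
                // disposing the uncommitted transaction is expected to
                // roll it back.
                tx.Commit();
                return affected;
            }
        }
        finally
        {
            if (openedHere) conn.Close();
        }
    }
}
```

In FillMediaSizes() this would replace the direct call, for example recordsAffected = TransactionalUpdate.UpdateInTransaction(conn, adapter, table); inside the existing try block.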