Improve insert performance for a C# application into SQL database

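I need to improve the performance of inserts in my C# application. I first go out and grab the data from a view. Then I insert into a table through a FOREACH loop. I am working with over 200,000 records, and the task takes a very long time to run. I know SaveChanges is a round trip to the database, but I'm not sure how to fix this. Is there anything I can do to reduce the time?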

            var values = db.TodaysAirs.ToList();
            foreach (TodaysAir x in values)
            {
                //check to see if this is a new value or one that needs to be updated
                var checkForNew = db.TodaysAirValues
                    .Where(m => m.ID == x.ID);

                //new record
                if (checkForNew.Count() == 0)
                {
                    TodaysAirValue newRecord = new TodaysAirValue();
                    newRecord.ID = x.ID;
                    newRecord.Logger_Id = x.Logger_Id;
                    newRecord.SiteName = x.SiteName;
                    newRecord.Latitude = x.Latitude;
                    newRecord.Longitude = x.Longitude;
                    newRecord.Hour = x.Hour;
                    newRecord.Parameter = x.Parameter;
                    newRecord.Stan = x.Stan;
                    newRecord.Units = x.Units;
                    newRecord.InstrumentType = x.InstrumentType;
                    newRecord.NowCast = x.NowCast;
                    newRecord.AQIValue = x.AQIValue;
                    newRecord.HealthCategory = x.HealthCategory;
                    newRecord.Hr24Avg = x.Hr24Avg;
                    newRecord.Hr24Max = x.Hr24Max;
                    newRecord.Hr24Min = x.Hr24Min;
                    newRecord.SID = DateTime.Now;

                    db.TodaysAirValues.Add(newRecord);
                    db.SaveChanges();
                    // CallJenkinsJob();
                }
            }

The goal should be to run a single raw SQL statement that looks very much like this:

INSERT INTO TodaysAirValues
    (ID, Logger_id, SiteName, Latitude, Longitude, Hour, Parameter,
     Stan, Units, InstrumentType, NowCast, AQIValue, HealthCategory,
     Hr24Avg, Hr24Max, Hr24Min, SID)

SELECT ta.ID, ta.Logger_id, ta.SiteName, ta.Latitude, ta.Longitude,
       ta.Hour, ta.Parameter, ta.Stan, ta.Units, ta.InstrumentType,
       ta.NowCast, ta.AQIValue, ta.HealthCategory, ta.Hr24Avg,
       ta.Hr24Max, ta.Hr24Min, current_timestamp
FROM TodaysAirs ta
LEFT JOIN TodaysAirValues tav ON tav.ID = ta.ID
WHERE tav.ID IS NULL

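This may not have every table or column name exactly right if the EF mapping differs from the database. You may also be able to make it run faster by using NOT EXISTS() instead of the LEFT JOIN WHERE NULL technique.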
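As a rough sketch of the NOT EXISTS() alternative, the same insert could be written as follows. It assumes the same table and column names as the statement above. Whether it actually runs faster depends on your indexes and the query plan, so it is worth comparing the two.

```sql
-- Insert only the rows from the view that are not yet in the target table.
INSERT INTO TodaysAirValues
    (ID, Logger_id, SiteName, Latitude, Longitude, Hour, Parameter,
     Stan, Units, InstrumentType, NowCast, AQIValue, HealthCategory,
     Hr24Avg, Hr24Max, Hr24Min, SID)
SELECT ta.ID, ta.Logger_id, ta.SiteName, ta.Latitude, ta.Longitude,
       ta.Hour, ta.Parameter, ta.Stan, ta.Units, ta.InstrumentType,
       ta.NowCast, ta.AQIValue, ta.HealthCategory, ta.Hr24Avg,
       ta.Hr24Max, ta.Hr24Min, current_timestamp
FROM TodaysAirs ta
WHERE NOT EXISTS (
    -- Correlated subquery: true when no target row has this ID yet.
    SELECT 1 FROM TodaysAirValues tav WHERE tav.ID = ta.ID
);
```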


I also see this:

if the Count is greater than 0 it checks to see if any changes were made and if so updates the record.

In that case, precede the INSERT above with an UPDATE (run this additional command first!) that looks like this:

UPDATE tav
   SET tav.ID=ta.ID, tav.Logger_id=ta.Logger_id, tav.SiteName=ta.SiteName,
       tav.Latitude=ta.Latitude, tav.Longitude=ta.Longitude, tav.Hour=ta.Hour, 
       tav.Parameter=ta.Parameter, tav.Stan=ta.Stan, tav.Units=ta.Units, 
       tav.InstrumentType=ta.InstrumentType, tav.NowCast=ta.NowCast,
       tav.AQIValue=ta.AQIValue, tav.HealthCategory=ta.HealthCategory,
       tav.Hr24Avg=ta.Hr24Avg,tav.Hr24Max=ta.Hr24Max, tav.Hr24Min=ta.Hr24Min,
       tav.SID=ta.SID -- possibly current_timestamp here instead
FROM TodaysAirs ta
INNER JOIN TodaysAirValues tav ON tav.ID = ta.ID
WHERE (
    -- compare here to decide if the record needs to update or not
)
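To illustrate what such a comparison might look like, here is a sketch that assumes SQL Server and assumes, purely for the example, that only NowCast, AQIValue, and HealthCategory decide whether a row has changed. Which columns really matter is something only you can say, so treat the column list as a placeholder.

```sql
-- Hypothetical example: update only rows where a compared column differs.
UPDATE tav
   SET tav.NowCast = ta.NowCast,
       tav.AQIValue = ta.AQIValue,
       tav.HealthCategory = ta.HealthCategory,
       tav.SID = current_timestamp
FROM TodaysAirs ta
INNER JOIN TodaysAirValues tav ON tav.ID = ta.ID
WHERE EXISTS (
    -- EXCEPT treats two NULLs as equal, so this yields a row only
    -- when at least one of the compared columns actually differs.
    SELECT ta.NowCast, ta.AQIValue, ta.HealthCategory
    EXCEPT
    SELECT tav.NowCast, tav.AQIValue, tav.HealthCategory
);
```

The EXCEPT form avoids writing a separate NULL check for every nullable column, which a plain `<>` comparison would need.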

Unfortunately, I don't have enough information about what you want here to give you the complete code.