从 db 读取 200MB 文件抛出内存不足异常

Reading 200MB file from db throws Out Of Memory Exception

我正在尝试查询数据库并提取 excel 可能大到 100 万行的文件 (~200MB) 存储为 varbinary并通过验证器。

我们的构建服务器有 6GB 内存和一个负载平衡的处理器,并且在运行时远未达到 CPU 或内存最大化。

然而,大约 40 秒后,进程抛出 OutOfMemoryException

这是堆栈跟踪:

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Data.SqlTypes.SqlBinary.get_Value()
   at System.Data.SqlClient.SqlBuffer.get_ByteArray()
   at System.Data.SqlClient.SqlBuffer.get_Value()
   at System.Data.SqlClient.SqlDataReader.GetValueFromSqlBufferInternal(SqlBuffer data, _SqlMetaData metaData)
   at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
   at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)
   at System.Data.SqlClient.SqlCommand.CompleteExecuteScalar(SqlDataReader ds, Boolean returnSqlValue)
   at System.Data.SqlClient.SqlCommand.ExecuteScalar()
   at eConfirmations.DataService.FileServices.FileDataService.GetFileContent(Guid fileId) in d:\w1\s\Source\eConfirmations.DataService\FileServices\FileDataService.cs:line 157
...
   at System.Data.SqlTypes.SqlBinary.get_Value()
   at System.Data.SqlClient.SqlBuffer.get_ByteArray()
   at System.Data.SqlClient.SqlBuffer.get_Value()
   at System.Data.SqlClient.SqlDataReader.GetValueFromSqlBufferInternal(SqlBuffer data, _SqlMetaData metaData)
   at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
   at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)
   at System.Data.SqlClient.SqlCommand.CompleteExecuteScalar(SqlDataReader ds, Boolean returnSqlValue)
   at System.Data.SqlClient.SqlCommand.ExecuteScalar()
   at eConfirmations.DataService.FileServices.FileDataService.GetFileContent(Guid fileId) in d:\w1\s\Source\eConfirmations.DataService\FileServices\FileDataService.cs:line 157

这是我抛出异常的代码:

    private byte[] GetFileContent(Guid fileId)
    {
        byte[] content;
        string connectionString = ConfigurationManager.ConnectionStrings["eConfirmationsDatabase"].ConnectionString;

        using (SqlConnection sqlConnection = new SqlConnection(connectionString))
        {
            using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
            {
                sqlCommand.CommandTimeout = 300;
                sqlCommand.CommandText = $"SELECT Content FROM dbo.[File] WHERE FileId = '{fileId}'";
                sqlConnection.Open();
                content = sqlCommand.ExecuteScalar() as byte[];
                sqlConnection.Close();
                sqlCommand.Dispose();
            }
            sqlConnection.Dispose();
        }
        return content;
    }

是否有更有效的方法来提取此数据,或者我们可以更新构建服务器上的设置以避免此错误吗?

好的,这是正在发生的事情:

因为这是 运行 在 32 位版本上,最大内存分配是 2GB,但我离这个阈值还很远。

根据与我的情况非常相似的 this Whosebug post,.NET 框架将对象限制在内存中 256MB

因此,即使我的文件只有 200MB,byte[]s 和 MemoryStreams 也会按 2 的幂扩展,直到达到所需的 256MB。当它们扩展时,它们会创建一个适当大小的新实例并将旧数据复制到新数据,有效地将内存使用量乘以 3,这会导致异常。

MSDN has an example of how to retrieve a large file using a FileStream, but instead of a FileStream, I use a static byte[] pre-initialized to the size of my data using this post.

这是我的最终解决方案:

    public File GetFileViaFileIdGuid(Guid fileId)
    {
        File file = new File();
        string connectionString = ConfigurationManager.ConnectionStrings["Database"].ConnectionString;
        using (var sourceSqlConnection = new SqlConnection(connectionString))
        {
            using (SqlCommand sqlCommand = sourceSqlConnection.CreateCommand())
            {
                sqlCommand.CommandText = $"SELECT FileName, FileExtension, UploadedDateTime, DATALENGTH(Content) as [ContentLength] FROM dbo.[File] WHERE FileId = '{fileId}'";
                sqlCommand.CommandType = CommandType.Text;
                sqlCommand.CommandTimeout = 300;
                sourceSqlConnection.Open();

                var reader = sqlCommand.ExecuteReader();
                while (reader.Read())
                {
                    file.FileId = fileId;
                    file.FileExtension = reader["FileExtension"].ToString();
                    file.FileName = reader["FileName"].ToString();
                    file.UploadedDateTime = (DateTime)reader["UploadedDateTime"];
                    file.Content = new byte[Convert.ToInt32(reader["ContentLength"])];
                }

                reader.Close();
                sourceSqlConnection.Close();
            }
        }
        file.Content = GetFileContent(file.FileId, file.Content.Length);
        return file;
    }

并获取内容:

    private byte[] GetFileContent(Guid fileId, int contentLength)
    {
        int outputSize = 1048576;
        int bufferSize = contentLength + outputSize;
        byte[] content = new byte[bufferSize];
        string connectionString = ConfigurationManager.ConnectionStrings["Database"].ConnectionString;

        using (SqlConnection sqlConnection = new SqlConnection(connectionString))
        {
            using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
            {
                sqlCommand.CommandTimeout = 300;
                sqlCommand.CommandText = $"SELECT Content FROM dbo.[File] WHERE FileId = '{fileId}'";
                sqlConnection.Open();
                using (SqlDataReader reader = sqlCommand.ExecuteReader(CommandBehavior.SequentialAccess))
                {

                    while (reader.Read())
                    {
                        int startIndex = 0;
                        long returnValue = reader.GetBytes(0, startIndex, content, startIndex, outputSize);
                        while (returnValue == outputSize)
                        {
                            startIndex += outputSize;
                            returnValue = reader.GetBytes(0, startIndex, content, startIndex, outputSize);
                        }
                    }
                }

                sqlConnection.Close();
            }
        }
        return content;
    }