Reading a 200MB file from the database throws an OutOfMemoryException
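I'm trying to query a database, pull out Excel files that can be as large as 1 million rows (~200MB) and are stored as varbinary, and pass them through a validator.
Our build server has 6GB of memory and a load-balanced processor, and at runtime it comes nowhere near maxing out CPU or memory.
Yet after about 40 seconds the process throws an OutOfMemoryException.
Here is the stack trace: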
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at System.Data.SqlTypes.SqlBinary.get_Value()
at System.Data.SqlClient.SqlBuffer.get_ByteArray()
at System.Data.SqlClient.SqlBuffer.get_Value()
at System.Data.SqlClient.SqlDataReader.GetValueFromSqlBufferInternal(SqlBuffer data, _SqlMetaData metaData)
at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)
at System.Data.SqlClient.SqlCommand.CompleteExecuteScalar(SqlDataReader ds, Boolean returnSqlValue)
at System.Data.SqlClient.SqlCommand.ExecuteScalar()
at eConfirmations.DataService.FileServices.FileDataService.GetFileContent(Guid fileId) in d:\w1\s\Source\eConfirmations.DataService\FileServices\FileDataService.cs:line 157
...
at System.Data.SqlTypes.SqlBinary.get_Value()
at System.Data.SqlClient.SqlBuffer.get_ByteArray()
at System.Data.SqlClient.SqlBuffer.get_Value()
at System.Data.SqlClient.SqlDataReader.GetValueFromSqlBufferInternal(SqlBuffer data, _SqlMetaData metaData)
at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)
at System.Data.SqlClient.SqlCommand.CompleteExecuteScalar(SqlDataReader ds, Boolean returnSqlValue)
at System.Data.SqlClient.SqlCommand.ExecuteScalar()
at eConfirmations.DataService.FileServices.FileDataService.GetFileContent(Guid fileId) in d:\w1\s\Source\eConfirmations.DataService\FileServices\FileDataService.cs:line 157
Here is the code that throws the exception:
private byte[] GetFileContent(Guid fileId)
{
    byte[] content;
    string connectionString = ConfigurationManager.ConnectionStrings["eConfirmationsDatabase"].ConnectionString;
    using (SqlConnection sqlConnection = new SqlConnection(connectionString))
    {
        using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
        {
            sqlCommand.CommandTimeout = 300;
            sqlCommand.CommandText = $"SELECT Content FROM dbo.[File] WHERE FileId = '{fileId}'";
            sqlConnection.Open();
            content = sqlCommand.ExecuteScalar() as byte[];
            sqlConnection.Close();
            sqlCommand.Dispose();
        }
        sqlConnection.Dispose();
    }
    return content;
}
Is there a more efficient way to pull this data, or is there a setting we can change on the build server to avoid this error?
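OK, here is what was happening:
Because this runs as a 32-bit build, the maximum memory allocation is 2GB, but I was nowhere near that threshold.
According to this Stack Overflow post, which describes a situation very similar to mine, the .NET Framework limits objects to 256MB in memory.
So even though my file is only 200MB, byte[]s and MemoryStreams expand by powers of 2 until they reach the required 256MB. Each time they expand, they create a new instance of the appropriate size and copy the old data into the new one, effectively multiplying memory usage by 3, which causes the exception.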
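The growth pattern described above can be observed directly. The following is a minimal sketch, not the code from the question: GrowthDemo and CapacitiesWhileWriting are hypothetical names, and the exact growth policy is an implementation detail of MemoryStream that the sketch only records rather than guarantees.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class GrowthDemo
{
    // Writes totalBytes into a MemoryStream in chunkSize pieces and records
    // every distinct capacity the stream passes through on the way.
    public static List<int> CapacitiesWhileWriting(int totalBytes, int chunkSize)
    {
        var capacities = new List<int>();
        var chunk = new byte[chunkSize];
        using (var stream = new MemoryStream())
        {
            int written = 0;
            while (written < totalBytes)
            {
                int count = Math.Min(chunkSize, totalBytes - written);
                stream.Write(chunk, 0, count);
                written += count;
                if (capacities.Count == 0 || capacities[capacities.Count - 1] != stream.Capacity)
                {
                    capacities.Add(stream.Capacity);
                }
            }
        }
        return capacities;
    }

    public static void Main()
    {
        // 8 MB keeps the demo small; each step listed is a new backing array
        // that the old contents had to be copied into.
        foreach (int capacity in CapacitiesWhileWriting(8 * 1024 * 1024, 64 * 1024))
        {
            Console.WriteLine(capacity);
        }

        // Passing the final size up front allocates the backing array once.
        using (var presized = new MemoryStream(8 * 1024 * 1024))
        {
            Console.WriteLine("pre-sized capacity: " + presized.Capacity);
        }
    }
}
```

Every capacity change means the old and the new backing array are alive at the same time, which is why allocating the buffer once at its final size helps in a 32-bit process.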
MSDN has an example of how to retrieve a large file using a FileStream, but instead of a FileStream, I use a static byte[] pre-initialized to the size of my data using this post.
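For comparison, the stream-based approach can be sketched roughly as follows. This is an illustration under assumptions, not the MSDN sample: BlobStreaming, ReadChunk and CopyToStream are hypothetical names, and the delegate simply mirrors the shape of SqlDataReader.GetBytes with the column ordinal already bound, so the sketch runs without a database.

```csharp
using System;
using System.IO;

public static class BlobStreaming
{
    // Same shape as SqlDataReader.GetBytes with the ordinal fixed:
    // (dataOffset, buffer, bufferOffset, length) -> number of bytes actually read.
    public delegate long ReadChunk(long dataOffset, byte[] buffer, int bufferOffset, int length);

    // Copies the column to the destination through one small reusable buffer,
    // so the whole value never has to sit in memory at once.
    public static long CopyToStream(ReadChunk readChunk, Stream destination, int chunkSize = 81920)
    {
        var buffer = new byte[chunkSize];
        long offset = 0;
        long read;
        while ((read = readChunk(offset, buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, (int)read);
            offset += read;
        }
        return offset;
    }

    public static void Main()
    {
        // In-memory stand-in for the varbinary column (an assumption for the demo).
        byte[] source = new byte[300000];
        new Random(1).NextBytes(source);
        ReadChunk fakeColumn = (dataOffset, buffer, bufferOffset, length) =>
        {
            int count = (int)Math.Min(length, source.Length - dataOffset);
            if (count <= 0) return 0;
            Array.Copy(source, dataOffset, buffer, bufferOffset, count);
            return count;
        };

        using (var destination = new MemoryStream())
        {
            long copied = CopyToStream(fakeColumn, destination);
            Console.WriteLine(copied == source.Length);
        }
    }
}
```

With a real reader opened using CommandBehavior.SequentialAccess, the call would look something like BlobStreaming.CopyToStream((o, b, i, n) => reader.GetBytes(0, o, b, i, n), fileStream). This only fits if the validator can consume a stream or a temporary file; if it needs the whole file as a byte[], a single pre-sized array is the simpler choice.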
Here is my final solution:
public File GetFileViaFileIdGuid(Guid fileId)
{
    File file = new File();
    int contentLength = 0;
    string connectionString = ConfigurationManager.ConnectionStrings["Database"].ConnectionString;
    using (var sourceSqlConnection = new SqlConnection(connectionString))
    using (SqlCommand sqlCommand = sourceSqlConnection.CreateCommand())
    {
        // Fetch the metadata and the content size only; the content itself is read separately.
        sqlCommand.CommandText = $"SELECT FileName, FileExtension, UploadedDateTime, DATALENGTH(Content) as [ContentLength] FROM dbo.[File] WHERE FileId = '{fileId}'";
        sqlCommand.CommandType = CommandType.Text;
        sqlCommand.CommandTimeout = 300;
        sourceSqlConnection.Open();
        using (SqlDataReader reader = sqlCommand.ExecuteReader())
        {
            if (reader.Read())
            {
                file.FileId = fileId;
                file.FileExtension = reader["FileExtension"].ToString();
                file.FileName = reader["FileName"].ToString();
                file.UploadedDateTime = (DateTime)reader["UploadedDateTime"];
                // Keep just the length here. Allocating the array now would leave two
                // ~200MB buffers alive at once when GetFileContent allocates its own.
                contentLength = Convert.ToInt32(reader["ContentLength"]);
            }
        }
    }
    file.Content = GetFileContent(fileId, contentLength);
    return file;
}
And to get the content:
private byte[] GetFileContent(Guid fileId, int contentLength)
{
    // Read in 1MB chunks into a buffer allocated once at its final size,
    // so no intermediate array has to grow and be copied.
    const int chunkSize = 1048576;
    byte[] content = new byte[contentLength];
    string connectionString = ConfigurationManager.ConnectionStrings["Database"].ConnectionString;
    using (SqlConnection sqlConnection = new SqlConnection(connectionString))
    using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
    {
        sqlCommand.CommandTimeout = 300;
        sqlCommand.CommandText = $"SELECT Content FROM dbo.[File] WHERE FileId = '{fileId}'";
        sqlConnection.Open();
        // SequentialAccess streams the column instead of buffering the whole row.
        using (SqlDataReader reader = sqlCommand.ExecuteReader(CommandBehavior.SequentialAccess))
        {
            if (reader.Read())
            {
                int offset = 0;
                while (offset < contentLength)
                {
                    // Never request more than the space left in the buffer.
                    int toRead = Math.Min(chunkSize, contentLength - offset);
                    long read = reader.GetBytes(0, offset, content, offset, toRead);
                    if (read == 0)
                    {
                        break; // the column ended earlier than DATALENGTH reported
                    }
                    offset += (int)read;
                }
            }
        }
    }
    return content;
}