Extracting a large blob from SQL Server to a file takes a very long time in PowerShell

I have been researching ways to automate extracting blob columns to files, and this blog post details several approaches.

With BCP, larger files can be extracted from my database very quickly. I was able to extract a 2 GB file in 20 seconds. This is the sample command line I used, based on the example in the blog:

BCP "SELECT PictureData FROM BLOB_Test.dbo.PicturesTest " QUERYOUT C:\BLOBTest\BlobOut\WC.jpg -T -f "C:\BLOBTest\FormatFile\BLOB.fmt" -S <ServerName>\<InstanceName>

As an aside, I had to learn how to apply a format file to keep a length prefix from being inserted at the start of the output file. The format file has to use BCP's older non-XML layout, because the schema for the newer XML format files constrains the "PREFIX_LENGTH" entry so that it cannot be 0.
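To illustrate, a minimal old-style (non-XML) format file for a single varbinary(max) column might look like the sketch below; the version header, data type, and lengths are assumptions that would need to match your BCP version and table, but the third field of the column row is the prefix length, which is what has to be 0:

12.0
1
1       SQLBINARY       0       0       ""      1     PictureData     ""

The fields in the column row are: host file field order, host file data type, prefix length, host file data length, terminator, server column order, server column name, and collation.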

I would rather use PowerShell to extract the blobs, but the following code, based on a TechNet article, takes about two days to process the same 2 GB blob instead of BCP's 20 seconds.

## https://social.technet.microsoft.com/wiki/contents/articles/890.export-sql-server-blob-data-with-powershell.aspx
## Export of "larger" SQL Server blob to file with GetBytes-Stream

# Configuration data
   $Server     = ".\<Instance>";         # SQL Server Instance
   $Database   = "Blob_Test";            # Name of database
   $Dest       = "C:\BLOBTest\BLOBOut\"; # Path to export to
   $bufferSize = 8192;                   # Stream buffer size in bytes

# Select-Statement for name & blob with filter
   $Sql = "Select 
              [PictureName],
              [PictureData]
           From 
              dbo.PicturesTest";

# Open ADO.NET Connection
   $con = New-Object Data.SqlClient.SqlConnection;
   $con.ConnectionString = "Data Source=$Server;" +
                           "Integrated Security=True;" +
                           "Initial Catalog=$Database";
   $con.Open();

# New Command and Reader
   $cmd = New-Object Data.SqlClient.SqlCommand $Sql, $con;
   $rd  = $cmd.ExecuteReader();

# Create a byte array for the stream
   $out = [array]::CreateInstance('Byte', $bufferSize)

# Loop through records
   While ($rd.Read()) {
      Write-Output ("Exporting: {0}" -f $rd.GetString(0));
      
      # New BinaryWriter
         $fs = New-Object System.IO.FileStream ($Dest + $rd.GetString(0)), Create, Write;
         $bw = New-Object System.IO.BinaryWriter $fs;
      
         $start = 0;

      # Read first byte stream
         $received = $rd.GetBytes(1, $start, $out, 0, $bufferSize - 1);
      
      While ($received -gt 0) {
         $bw.Write($out, 0, $received);
         $bw.Flush();
         $start += $received;
         
         # Read next byte stream
            $received = $rd.GetBytes(1, $start, $out, 0, $bufferSize - 1);
      }
      
      $bw.Close();
      $fs.Close();
   }

# Closing & disposing all objects
   $fs.Dispose();
   $rd.Close();
   $cmd.Dispose();
   $con.Close();

Write-Output ("Finished");

It does eventually finish, but I have no idea why the script takes so long to complete.

Does anyone know why the PowerShell script runs so slowly?

You don't need a BinaryWriter at all. That class is only meant for writing primitive types such as integers, doubles, and strings in a .NET-specific format, and it is rarely used.

If you want to write bytes to a file, all you need is Stream.Write:

$fs.Write($out, 0, $received);
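Applied to the loop in the question, the direct write might look roughly like this (a sketch only, reusing the question's $Dest, $rd, $out, and $bufferSize, and dropping the BinaryWriter entirely):

$fs = New-Object System.IO.FileStream ($Dest + $rd.GetString(0)), Create, Write
$start = 0

# Read the blob in chunks and write each chunk straight to the FileStream
$received = $rd.GetBytes(1, $start, $out, 0, $bufferSize)
While ($received -gt 0) {
   $fs.Write($out, 0, $received)
   $start += $received
   $received = $rd.GetBytes(1, $start, $out, 0, $bufferSize)
}

$fs.Close()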

A better idea, which eliminates almost all of this code, is to use DbDataReader.GetStream instead of GetBytes to read the BLOB as a stream. After that you can use Stream.CopyTo to write the stream's contents to another stream:

$dbFs=$rd.GetStream(1); 
$dbFs.CopyTo($fs);
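Folded back into the question's loop, the whole export could collapse to something like the sketch below; opening the reader with CommandBehavior.SequentialAccess is an assumption on my part so that GetStream can stream the column instead of buffering the entire value first:

$rd = $cmd.ExecuteReader([System.Data.CommandBehavior]::SequentialAccess)
While ($rd.Read()) {
   # Column 0 = PictureName, column 1 = PictureData (must be read in order with SequentialAccess)
   $fs   = New-Object System.IO.FileStream ($Dest + $rd.GetString(0)), Create, Write
   $dbFs = $rd.GetStream(1)   # stream over the blob column
   $dbFs.CopyTo($fs)          # copy the database stream straight into the file
   $fs.Close()
}
$rd.Close()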

This script from 2010 still works and runs very fast.

$server = ".";
$database = "YourDatab";
$query = "SELECT FileContent,FileName FROM dbo.FileUploads";
$dirPath = "C:\Data\"
 
$connection=new-object System.Data.SqlClient.SQLConnection
$connection.ConnectionString="Server={0};Database={1};Integrated Security=True" -f $server,$database
$command=new-object system.Data.SqlClient.SqlCommand($query,$connection)
$command.CommandTimeout=120
$connection.Open()
$reader = $command.ExecuteReader()
while ($reader.Read())
{
    $sqlBytes = $reader.GetSqlBytes(0)
    $filepath = "$dirPath{0}" -f $reader.GetValue(1)
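    # GetBytes with a $null buffer returns the total length of the blob,
    # which is used here to size the byte array before the real read below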
    $buffer = new-object byte[] -ArgumentList $reader.GetBytes(0,0,$null,0,$sqlBytes.Length)
    $reader.GetBytes(0,0,$buffer,0,$buffer.Length)
    $fs = new-object System.IO.FileStream($filePath,[System.IO.FileMode]'Create',[System.IO.FileAccess]'Write')
    $fs.Write($buffer, 0, $buffer.Length)
    $fs.Close()
}
$reader.Close()
$connection.Close()

Source: https://www.sqlservercentral.com/blogs/t-sql-tuesday-006-blobs-filestream-and-powershell