File splitter method goes wrong when working with decimals

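I'm trying to develop a file splitter method that splits a file into chunks of a desired size. It works perfectly for files whose size divides evenly (for example, a 2097152-byte file that I want to split into two chunks gives a first chunk of 1048576 bytes and a second chunk of 1048576 bytes).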

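The problem appears when I try to split a file whose size gives a fractional result when divided. For example, I want to split an 8194321-byte file into two (or however many) chunks. Half of the file size is 4097160.5 bytes, but because I need to work with integers I set the chunk size to 4097161 bytes, expecting a first chunk of 4097161 bytes and a second chunk of 4097160 bytes. When I try to split the file, I get a System.ArgumentException on this instruction while processing the last chunk: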

outputStream.Write(buffer, bufferLength * bufferCount, tmpBufferLength)

With this error message:

Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection.

How can I fix my file splitter method so that it correctly splits files whose size does not divide evenly?

This is a usage example:

Split(sourceFile:=Me.fileToSplit,
      chunkSize:=CInt(New FileInfo(fileToSplit).Length / 2),
      chunkName:="File.Part",
      chunkExt:="fs")

The relevant code of the file splitter:

''' <summary>
''' Splits a file into manageable chunks.
''' </summary>
''' <param name="sourceFile">The file to split.</param>
''' <param name="chunkSize">The size per chunk.</param>
''' <param name="chunkName">The name formatting for chunks.</param>
''' <param name="chunkExt">The file-extension for chunks.</param>
Public Sub Split(ByVal sourceFile As String,
                 ByVal chunkSize As Integer,
                 ByVal chunkName As String,
                 ByVal chunkExt As String)

    ' FileInfo instance of the source file.
    Dim fInfo As New FileInfo(sourceFile)

    ' The total filesize to split, in bytes.
    Dim totalSize As Long = fInfo.Length

    ' The remaining size to calculate the percentage, in bytes.
    Dim sizeRemaining As Long = totalSize

    ' Counts the length of the current chunk file to calculate the percentage, in bytes.
    Dim sizeWritten As Long = 0L

    ' The buffer to read data and write the chunks.
    Dim buffer As Byte() = New Byte() {}

    ' The buffer length.
    Dim bufferLength As Integer = 524288 ' 512 Kb

    ' The total amount of chunks to create.
    Dim chunkCount As Long = CLng(Math.Ceiling((fInfo.Length - bufferLength) / (chunkSize)))

    ' Keeps track of the current chunk.
    Dim chunkIndex As Long = 0L

    ' A zero-filled string to enumerate the chunk parts.
    Dim enumeration As String = String.Empty

    ' The chunks filename.
    Dim chunkFilename As String = String.Empty

    ' Open the file to start reading bytes.
    Using inputStream As New FileStream(fInfo.FullName, FileMode.Open)

        Using binaryReader As New BinaryReader(inputStream)

            While (inputStream.Position < inputStream.Length)

                chunkIndex += 1L 'Increment the chunk file counter.

                ' Set chunk filename.
                enumeration = New String("0"c, CStr(chunkCount).Length - CStr(chunkIndex).Length)
                chunkFilename = String.Format("{0}.{1}.{2}", chunkName, enumeration & CStr(chunkIndex), chunkExt)

                ' Reset written byte-length counter.
                sizeWritten = 0L

                ' Create the chunk file to Write the bytes.
                Using outputStream As New FileStream(chunkFilename, FileMode.Create)

                    ' Read until reached the end-bytes of the input file.
                    While (sizeWritten < chunkSize) AndAlso (inputStream.Position < inputStream.Length)

                        ' Read bytes from the source file.
                        buffer = binaryReader.ReadBytes(chunkSize)

                        Dim bufferCount As Integer = 0
                        Dim tmpBufferLength As Integer = bufferLength

                        While (sizeWritten < chunkSize)

                            If (bufferLength + (bufferLength * bufferCount)) >= chunkSize Then
                                tmpBufferLength = chunkSize - ((bufferLength * bufferCount))
                            End If

                            ' Write those bytes in the chunk file.
                            outputStream.Write(buffer, bufferLength * bufferCount, tmpBufferLength)

                            bufferCount += 1

                            ' Increment the bytes-written counter.
                            sizeWritten += tmpBufferLength

                            ' Decrease the bytes-remaining counter.
                            sizeRemaining -= tmpBufferLength

                            ' Reset the temporal buffer length.
                            tmpBufferLength = bufferLength

                        End While

                    End While ' (sizeWritten < chunkSize) AndAlso (inputStream.Position < inputStream.Length)

                    outputStream.Flush()

                End Using ' outputStream

            End While ' inputStream.Position < inputStream.Length

        End Using ' binaryReader

    End Using ' inputStream

End Sub

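EDIT: I forgot to mention that the While (sizeWritten < chunkSize) block exists because I raise some events inside it. Instead of writing the entire buffer at once, I use that While loop to "slowly" write it in smaller pieces. That way I split files at an exact size, except for files whose size does not divide evenly, which throw the exception I mentioned.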

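You need to work out the correct amount to read before you actually read. Right now you always read chunkSize bytes, but sometimes you discard the tail of that buffer. On the last chunk, ReadBytes returns fewer than chunkSize bytes because the source ends first, yet the inner loop still tries to write chunkSize bytes, so the Write call runs past the end of the array.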

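I think the intent was for tmpBufferLength to hold the correct buffer length. Assuming that works (I was too lazy to verify...), read exactly that amount from the source and then write the whole buffer to the destination.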
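As a rough illustration of "work out the amount before reading", the sketch below computes the length of the next read as the smallest of three values: the buffer length, the bytes still missing from the current chunk, and the bytes left in the source. The helper name NextReadLength and its parameters are hypothetical and not part of the original code; the sizes in Main are taken from the question.

```vbnet
Imports System

Module ReadLengthSketch

    ' Hypothetical helper: how many bytes to read for the next write into the current chunk.
    Function NextReadLength(bufferLength As Integer,
                            chunkSize As Long,
                            sizeWritten As Long,
                            sourceRemaining As Long) As Integer

        ' Bytes still missing from the current chunk.
        Dim chunkRemaining As Long = chunkSize - sizeWritten

        ' Never read more than the buffer, the chunk, or the source can take.
        Dim length As Long = Math.Min(CLng(bufferLength), Math.Min(chunkRemaining, sourceRemaining))

        Return CInt(Math.Max(0L, length))

    End Function

    Sub Main()
        ' 8194321-byte file, 4097161-byte chunks, 524288-byte buffer.
        Console.WriteLine(NextReadLength(524288, 4097161, 0, 8194321))       ' 524288 (full buffer)
        Console.WriteLine(NextReadLength(524288, 4097161, 3670016, 4524305)) ' 427145 (last write of chunk 1)
        Console.WriteLine(NextReadLength(524288, 4097161, 3670016, 427144))  ' 427144 (last write of chunk 2)
    End Sub

End Module
```

With the length known up front, the bytes returned by a single read can be written in full with offset 0, so no offset arithmetic on the buffer is needed.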

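Here is a simpler version that should do what you want, without the drama of large chunk sizes. This code was written in C# and converted to VB with this tool: http://converter.telerik.com/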

Public Sub Split(sourceFile As String,
                 chunkSize As Integer,
                 chunkName As String,
                 chunkExt As String)
    Using source As New FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read)
        Dim buffer As Byte() = New Byte(8191) {} ' 8192 bytes: VB array sizes are given as an inclusive upper bound.
        Dim chunkIndex As Integer = 0

        While source.Position < source.Length
            chunkIndex += 1

            Using chunk As New FileStream(String.Format("{0}.{1:0000}.{2}", chunkName, chunkIndex, chunkExt), FileMode.Create, FileAccess.Write, FileShare.Read)
                While source.Position < source.Length AndAlso chunk.Position < CLng(chunkSize)
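                    ' Limit each read to the buffer size, the bytes still missing from this chunk,
                    ' and the bytes left in the source, so a chunk never grows past chunkSize.
                    Dim len As Integer = CInt(Math.Min(CLng(buffer.Length), Math.Min(chunkSize - chunk.Position, source.Length - source.Position)))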

                    len = source.Read(buffer, 0, len)
                    chunk.Write(buffer, 0, len)
                End While
            End Using
        End While
    End Using
End Sub
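One caveat about the chunk size passed in by the caller: in VB the / operator on integers yields a Double, and CInt rounds midpoint values to the nearest even integer, so CInt(8194321 / 2) evaluates to 4097160 rather than 4097161. With that chunk size a split produces two chunks of 4097160 bytes plus a third chunk of 1 byte. A minimal sketch of a round-up integer division that avoids this is shown below; ChunkSizeFor is a hypothetical helper name, not something from the original code.

```vbnet
Imports System

Module ChunkSizeSketch

    ' Hypothetical helper: smallest chunk size that fits the file into at most `parts` chunks.
    Function ChunkSizeFor(totalSize As Long, parts As Integer) As Long
        If parts <= 0 Then Throw New ArgumentOutOfRangeException("parts")

        ' Integer division (\) with round-up avoids Double arithmetic altogether.
        Return (totalSize + parts - 1) \ parts
    End Function

    Sub Main()
        Console.WriteLine(CInt(8194321 / 2))        ' 4097160 (midpoint rounds to even)
        Console.WriteLine(ChunkSizeFor(8194321, 2)) ' 4097161
        Console.WriteLine(ChunkSizeFor(2097152, 2)) ' 1048576
    End Sub

End Module
```

Since the Split versions shown so far take chunkSize As Integer, the result would be passed as CInt(ChunkSizeFor(New FileInfo(fileToSplit).Length, 2)), which is only safe while the chunk size fits in an Integer.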

Finally, I'll share my solution for splitting files at an exact size:

(The event-related code needs to be stripped out of this code.)

''' <summary>
''' Splits a file into manageable chunks, or merges the split chunks.
''' With progress-percent features.
''' </summary>
Public NotInheritable Class FileSplitter

...

''' <summary>
''' Gets or sets the buffer-size used to split or merge, in Bytes.
''' Default value is: 524288 bytes (512 Kb).
''' </summary>
''' <value>The buffer-size.</value>
Public Property BufferSize As Integer = 524288
'    4096 Bytes (  4 Kb) This is the default Microsoft's FileStream implementation buffer size.
'    8192 Bytes (  8 Kb)
'   16384 Bytes ( 16 Kb)
'   32768 Bytes ( 32 Kb)
'   65536 Bytes ( 64 Kb)
'  131072 Bytes (128 Kb)
'  262144 Bytes (256 Kb)
'  524288 Bytes (512 Kb)
' 1048576 Bytes (  1 Mb)

''' <summary>
''' Splits a file into manageable chunks.
''' </summary>
''' <param name="sourceFile">The file to split.</param>
''' <param name="chunkSize">The size per chunk.</param>
''' <param name="chunkName">The name formatting for chunks.</param>
''' <param name="chunkExt">The file-extension for chunks.</param>
''' <param name="overwrite">If set to <c>True</c>, any existing file will be overwritten if needed to create a chunk, otherwise, an exception will be thrown.</param>
''' <param name="deleteAfterSplit">If set to <c>True</c>, the input file will be deleted after a successful split operation.</param>
''' <exception cref="System.IO.FileNotFoundException">The specified source file doesn't exist.</exception>
''' <exception cref="System.IO.IOException">File already exists.</exception>
''' <exception cref="System.OverflowException">'chunkSize' value should be smaller than the source filesize.</exception>
Public Sub Split(ByVal sourceFile As String,
                 ByVal chunkSize As Long,
                 Optional ByVal chunkName As String = Nothing,
                 Optional ByVal chunkExt As String = Nothing,
                 Optional ByVal overwrite As Boolean = False,
                 Optional ByVal deleteAfterSplit As Boolean = False)

    If Not File.Exists(sourceFile) Then
        Throw New FileNotFoundException("The specified source file doesn't exist.", sourceFile)
        Exit Sub
    End If

    ' FileInfo instance of the source file.
    Dim fInfo As New FileInfo(sourceFile)

    ' The total filesize to split, in bytes.
    Dim totalSize As Long = fInfo.Length

    ' The remaining size to calculate the percentage, in bytes.
    Dim sizeRemaining As Long = totalSize

    ' Counts the length of the current chunk file to calculate the percentage, in bytes.
    Dim sizeWritten As Long

    ' The buffer to read data and write the chunks.
    Dim buffer As Byte()

    ' The buffer length.
    Dim bufferLength As Integer

    ' The total amount of chunks to create.
    Dim chunkCount As Long = CLng(Math.Ceiling((fInfo.Length) / (chunkSize)))

    ' Keeps track of the current chunk.
    Dim chunkIndex As Long

    ' Keeps track of the amount of buffer-writing operations.
    Dim writeCounts As Integer

    ' Keeps track of the current buffer-writing operation.
    Dim writeCount As Integer

    ' A zero-filled string to enumerate the chunk parts.
    Dim fileEnumeration As String

    ' The chunks filename.
    Dim chunkFilename As String

    ' The chunks basename.
    chunkName = If(String.IsNullOrEmpty(chunkName),
                   Path.Combine(fInfo.DirectoryName, Path.GetFileNameWithoutExtension(fInfo.Name)),
                   Path.Combine(fInfo.DirectoryName, chunkName))

    ' The chunks file extension.
    chunkExt = If(String.IsNullOrEmpty(chunkExt),
                  fInfo.Extension.Substring(1),
                  chunkExt)

    Select Case chunkSize ' Set buffer size and calculate chunk count.

        Case Is >= fInfo.Length ' chunk size is bigger than source-file size.
            Throw New OverflowException("'chunkSize' value should be smaller than the source filesize.")
            Exit Sub

        Case Is < Me.BufferSize ' chunk size is smaller than buffer size.
            bufferLength = CInt(chunkSize)

        Case Else ' chunk size is bigger than buffer size.
            bufferLength = Me.BufferSize

    End Select ' chunkSize

    If Not overwrite Then ' If not file overwriting is allowed then...

        For index As Long = 1L To chunkCount ' Start index based on 1 (eg. "File.Part.1.ext").

            ' Set chunk filename.
            fileEnumeration = New String("0"c, CStr(chunkCount).Length - CStr(index).Length)
            chunkFilename = String.Format("{0}.{1}.{2}", chunkName, fileEnumeration & CStr(index), chunkExt)

            ' If chunk file already exists then...
            If File.Exists(chunkFilename) Then
                Throw New IOException(String.Format("File already exists: {0}", chunkFilename))
                Exit Sub
            End If

        Next index

    End If ' overwrite

    ' Open the file to start reading bytes.
    Using inputStream As New FileStream(fInfo.FullName, FileMode.Open)

        Using binaryReader As New BinaryReader(inputStream)

            While (inputStream.Position < inputStream.Length)

                ' Increment the chunk file counter.
                chunkIndex += 1L

                ' Set chunk filename.
                fileEnumeration = New String("0"c, CStr(chunkCount).Length - CStr(chunkIndex).Length)
                chunkFilename = String.Format("{0}.{1}.{2}", chunkName, fileEnumeration & CStr(chunkIndex), chunkExt)

                ' Reset written byte-length counter.
                sizeWritten = 0L

                ' Create the chunk file to Write the bytes.
                Using outputStream As New FileStream(chunkFilename, FileMode.Create)

                    ' Calculate the amount of buffer-writing operations.
                    writeCounts = CInt(Math.Ceiling(chunkSize / bufferLength))
                    writeCount = 0

                    ' Read until reached the end-bytes of the input file.
                    While (inputStream.Position < inputStream.Length) AndAlso (sizeWritten < chunkSize)

                        ' Increment the buffer-writing counter.
                        writeCount += 1

                        ' Clamp this read to the bytes still missing from the current chunk.
                        ' A local is used so that bufferLength is not shrunk for the following chunks.
                        Dim readLength As Integer = CInt(Math.Min(CLng(bufferLength), chunkSize - sizeWritten))

                        ' Read bytes from the input file.
                        buffer = binaryReader.ReadBytes(readLength)

                        ' Write those bytes in the chunk file.
                        outputStream.Write(buffer, 0, buffer.Length)

                        ' Increment the bytes-written counter.
                        sizeWritten += buffer.Length

                        ' Decrease the bytes-remaining counter.
                        sizeRemaining -= buffer.Length

                    End While ' (inputStream.Position < inputStream.Length) AndAlso (sizeWritten < chunkSize)

                    outputStream.Flush()

                End Using ' outputStream

            End While ' (inputStream.Position < inputStream.Length)

        End Using ' binaryReader

    End Using ' inputStream

End Sub

...

End Class
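To check that a split really produced exact sizes, it may help to compute the chunk lengths that are expected for a given file size and chunk size, and compare them with the lengths of the files on disk. The sketch below only does the arithmetic; ExpectedChunkSizes is a hypothetical helper that is not part of the class above.

```vbnet
Imports System
Imports System.Collections.Generic

Module ExpectedSizesSketch

    ' Hypothetical helper: the chunk lengths an exact-size split should produce.
    Function ExpectedChunkSizes(totalSize As Long, chunkSize As Long) As List(Of Long)
        If chunkSize <= 0 Then Throw New ArgumentOutOfRangeException("chunkSize")

        Dim sizes As New List(Of Long)
        Dim remaining As Long = totalSize

        ' Every chunk is full-sized except possibly the last one.
        While remaining > 0
            Dim current As Long = Math.Min(chunkSize, remaining)
            sizes.Add(current)
            remaining -= current
        End While

        Return sizes
    End Function

    Sub Main()
        ' Prints "4097161, 4097160" for the file from the question.
        Console.WriteLine(String.Join(", ", ExpectedChunkSizes(8194321, 4097161)))
    End Sub

End Module
```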