How to merge two memory streams containing PDF file data into one

Question

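I am trying to read two PDF files into two memory streams and then return a single stream that contains the data of both. However, I can't figure out what is wrong with my code.

Sample code: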

string file1Path = "Sample1.pdf";
string file2Path = "Sample2.pdf";
MemoryStream stream1 = new MemoryStream(File.ReadAllBytes(file1Path));
MemoryStream stream2 = new MemoryStream(File.ReadAllBytes(file2Path));
stream1.Position = 0;
stream1.CopyTo(stream2);
return stream2;   /* supposed to contain the data of both stream1 and stream2, but contains the data of stream1 only */

Answer 1

Note:

The whole question is based on a false premise, that you can produce a combined PDF file by merging the binaries of two PDF files. This works for plain text files for example (to an extent), but definitely doesn't work for PDFs. The answer only addresses how to merge two binary data streams, not how to merge two PDF files in particular. It answers the OP's question as asked, but doesn't actually solve his problem.

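When you use the byte[] constructor of MemoryStream, the memory stream will not expand as you add more data, so it is not big enough to hold both stream1 and stream2. In addition, its position starts at zero, so you are overwriting the contents of stream2 with the data from stream1.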
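To see both effects in isolation, here is a minimal sketch with tiny byte arrays in place of the PDF files. The class and method names (FixedSizeMemoryStreamDemo, CopyInto) are made up for illustration; only the MemoryStream behavior comes from .NET.

```csharp
using System;
using System.IO;

public static class FixedSizeMemoryStreamDemo
{
    // Copies `first` into a MemoryStream wrapping `second`, like the question's code.
    // Returns the resulting bytes as hex, or the name of the exception that was thrown.
    public static string CopyInto(byte[] first, byte[] second)
    {
        var stream1 = new MemoryStream(first);
        var stream2 = new MemoryStream(second); // wraps `second`; cannot grow

        try
        {
            // Writing starts at stream2.Position == 0, so existing bytes are overwritten.
            stream1.CopyTo(stream2);
        }
        catch (NotSupportedException)
        {
            // Thrown when the copy would need more room than the wrapped array has.
            return "NotSupportedException";
        }

        return BitConverter.ToString(stream2.ToArray());
    }
}
```

With these inputs, CopyInto(new byte[] { 1, 2 }, new byte[] { 9, 9, 9 }) returns "01-02-09" (the start of the second array was overwritten), while CopyInto(new byte[] { 1, 2, 3, 4 }, new byte[] { 9, 9 }) returns "NotSupportedException" because the stream cannot expand.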

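The fix is rather simple:

var result = new MemoryStream(); // default constructor: the stream grows as needed
using (var file1 = File.OpenRead(file1Path)) file1.CopyTo(result);
using (var file2 = File.OpenRead(file2Path)) file2.CopyTo(result);
result.Position = 0; // rewind so that the caller reads from the start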

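Another option would be to create your own stream class that is a combination of two separate streams. That is interesting if you care about composability, but it is probably overkill for something as simple as this :)

Just for fun, it could look something like this: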

using System;
using System.IO;

public class DualStream : Stream
{
    private readonly Stream _first;
    private readonly Stream _second;

    public DualStream(Stream first, Stream second)
    {
        _first = first;
        _second = second;
    }

    public override bool CanRead => true;
    public override bool CanSeek => _first.CanSeek && _second.CanSeek;
    public override bool CanWrite => false;
    public override long Length => _first.Length + _second.Length;

    public override long Position
    {
        get { return _first.Position + _second.Position; }
        set { Seek(value, SeekOrigin.Begin); }
    }

    public override void Flush() { /* read-only stream: nothing to flush */ }

    public override int Read(byte[] buffer, int offset, int count)
    {
        var bytesRead = _first.Read(buffer, offset, count);

        // A short read does not have to mean end of stream, so only move on
        // to the second stream once the first one returns no data at all.
        if (bytesRead > 0 || count == 0) return bytesRead;

        return _second.Read(buffer, offset, count);
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        // To simplify, let's assume seek always works as if over one big MemoryStream
        long targetPosition;

        switch (origin)
        {
            case SeekOrigin.Begin: targetPosition = offset; break;
            case SeekOrigin.Current: targetPosition = Position + offset; break;
            case SeekOrigin.End: targetPosition = Length + offset; break; // offset is relative to the end (usually negative)
            default: throw new NotSupportedException();
        }

        targetPosition = Math.Max(0, Math.Min(Length, targetPosition));

        var firstPosition = Math.Min(_first.Length, targetPosition);
        _first.Position = firstPosition;
        _second.Position = Math.Max(0, targetPosition - firstPosition);

        return Position;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _first.Dispose();
            _second.Dispose();
        }

        base.Dispose(disposing);
    }

    // The combined stream is read-only, so these operations are not supported.
    public override void SetLength(long value)
      { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count)
      { throw new NotSupportedException(); }
}

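The main benefit is that you don't have to allocate unnecessary memory buffers just to merge the streams. It can even be used directly with file streams, if you dare :D It is also easily composable: you can make dual streams out of other dual streams, which lets you chain together as many streams as you want, much like Enumerable.Concat does for IEnumerable<T> sequences.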
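If nesting pairs feels awkward, the same chaining idea can be sketched as a single class that takes any number of streams. This is only an illustration under a few assumptions: ConcatStream and ConcatStreamDemo are hypothetical names (not part of .NET), the stream is forward-only (no seeking), and it disposes each source once it has been read to the end.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical read-only, forward-only stream that reads its sources one after another.
public sealed class ConcatStream : Stream
{
    private readonly Queue<Stream> _sources;

    public ConcatStream(params Stream[] sources)
    {
        _sources = new Queue<Stream>(sources);
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length { get { throw new NotSupportedException(); } }

    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { /* nothing to flush on a read-only stream */ }

    public override int Read(byte[] buffer, int offset, int count)
    {
        while (_sources.Count > 0)
        {
            int bytesRead = _sources.Peek().Read(buffer, offset, count);
            if (bytesRead > 0 || count == 0) return bytesRead;

            // The current source is exhausted: dispose it and try the next one.
            _sources.Dequeue().Dispose();
        }

        return 0; // every source has been read to the end
    }

    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            while (_sources.Count > 0) _sources.Dequeue().Dispose();
        }

        base.Dispose(disposing);
    }
}

public static class ConcatStreamDemo
{
    // Reads all sources in order and returns the concatenated bytes.
    public static byte[] ReadAll(params Stream[] sources)
    {
        using (var combined = new ConcatStream(sources))
        using (var result = new MemoryStream())
        {
            combined.CopyTo(result);
            return result.ToArray();
        }
    }
}
```

For example, ConcatStreamDemo.ReadAll(new MemoryStream(new byte[] { 1, 2 }), new MemoryStream(new byte[] { 3 }), new MemoryStream(new byte[] { 4, 5 })) yields the bytes 1, 2, 3, 4, 5. As with the answer above, this only concatenates raw bytes; it does not produce a valid merged PDF.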

Answer 2

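Merging memory streams that hold PDF files is not the same as merging .txt files. For PDFs you need a library such as iTextSharp.dll, which is what I used (available under the AGPL license), and then you combine the documents with that library's functions, like this: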

using System.IO;
using iTextSharp.text.pdf;

MemoryStream finalStream = new MemoryStream();
PdfCopyFields copy = new PdfCopyFields(finalStream);
string file1Path = "Sample1.pdf";
string file2Path = "Sample2.pdf";

// Add the first document
var ms1 = new MemoryStream(File.ReadAllBytes(file1Path));
ms1.Position = 0;
copy.AddDocument(new PdfReader(ms1));
ms1.Dispose();

// Add the second document
var ms2 = new MemoryStream(File.ReadAllBytes(file2Path));
ms2.Position = 0;
copy.AddDocument(new PdfReader(ms2));
ms2.Dispose();

// Finish writing the merged PDF into finalStream
copy.Close();

finalStream now contains the merged PDF of ms1 and ms2.

c#

memorystream

itextsharp