Copy document content (including formatting and page layout) to another document using Word Interop in C# with 100% fidelity

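I want to copy the content of a user-created document into an existing document. The content of the existing document must end up exactly the same as the user-created document.

I can't simply copy the file with System.IO, or save a copy of the user-created document with the SaveAs method in Word Interop. The existing document is generated by a web server and contains VBA modules that are used to upload it back to the server.

The server-generated document (the existing document) is a Word 2003 document, but the user-created document may be either a Word 2003 or a Word 2007+ document.

With these restrictions in mind, I first wrote the following: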

using Word = Microsoft.Office.Interop.Word;

string tempsave = @"...";      // location of user created document
string savelocation = @"...";  // location of existing document
Word.Application objWordOpen = new Word.Application();
Word.Document doclocal = objWordOpen.Documents.Open(tempsave);
Word.Document d1 = objWordOpen.Documents.Open(savelocation);
// Copy the whole source document to the clipboard
Word.Range oRange = doclocal.Content;
oRange.Copy();
// Select everything in the existing document and paste over it
d1.Activate();
d1.UpdateStyles();
d1.ActiveWindow.Selection.WholeStory();
d1.ActiveWindow.Selection.PasteAndFormat(Word.WdRecoveryType.wdFormatOriginalFormatting);

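This mostly works. However, tables get messed up.

Also, if there is a page break, the output is different.

User-created document:

Output (existing document):

In addition, a paragraph mark is added at the end of the document, as follows:

User-created document:

Output (existing document):

The page format is also messed up: the output has mirror margins set.

User-created document:

Output (existing document):

I have also tried the Range.Insert() approach, setting the range without copying as described here, but I still run into these problems.

I have also tried adding the VBA modules to the document, but there are also document variables and other custom properties, and I don't want to mess with the file that gets uploaded to the server. How can I deal with these problems? Both files are based on the Normal template.

I am open to other suggestions on this subject, but I know .doc files are not as easy to work with as the .docx format, which is why I think I am stuck with COM Interop.

Thanks.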

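Update: Based on the macropod code posted by Charles Kenyon, I managed to copy more of the formatting from the source to the target. Still, page breaks behave differently: the paragraph mark is placed on the new page instead of on the same page. Also, the text is slightly larger even though the font size is the same.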

            Word.Range oRange;
            oRange = Source.Content;
            Target.Content.FormattedText = oRange.FormattedText;
            LayoutTransfer(Source, Target);

The LayoutTransfer method:

private void LayoutTransfer(Document source, Document target)
        {
            float sPageHght;
            float sPageWdth;
            float sHeaderDist;
            float sFooterDist;
            float sTMargin;
            float sBMargin;
            float sLMargin;
            float sRMargin;
            float sGutter;
            WdGutterStyle sGutterPos;
            WdPaperSize lPaperSize;
            WdGutterStyleOld lGutterStyle;
            int lMirrorMargins;
            WdVerticalAlignment lVerticalAlignment;
            WdSectionStart lScnStart;
            WdSectionDirection lScnDir;
            int lOddEvenHdFt;
            int lDiffFirstHdFt;
            bool bTwoPagesOnOne;
            bool bBkFldPrnt;
            int bBkFldPrnShts;
            bool bBkFldRevPrnt;
            WdOrientation lOrientation;
            foreach (Word.Section section in source.Sections)
            {
                lPaperSize = section.PageSetup.PaperSize;
                lGutterStyle = section.PageSetup.GutterStyle;
                lOrientation = section.PageSetup.Orientation;
                lMirrorMargins = section.PageSetup.MirrorMargins;
                lScnStart = section.PageSetup.SectionStart;
                lScnDir = section.PageSetup.SectionDirection;
                lOddEvenHdFt = section.PageSetup.OddAndEvenPagesHeaderFooter;
                lDiffFirstHdFt = section.PageSetup.DifferentFirstPageHeaderFooter;
                lVerticalAlignment = section.PageSetup.VerticalAlignment;
                sPageHght = section.PageSetup.PageHeight;
                sPageWdth = section.PageSetup.PageWidth;
                sTMargin = section.PageSetup.TopMargin;
                sBMargin = section.PageSetup.BottomMargin;
                sLMargin = section.PageSetup.LeftMargin;
                sRMargin = section.PageSetup.RightMargin;
                sGutter = section.PageSetup.Gutter;
                sGutterPos = section.PageSetup.GutterPos;
                sHeaderDist = section.PageSetup.HeaderDistance;
                sFooterDist = section.PageSetup.FooterDistance;
                bTwoPagesOnOne = section.PageSetup.TwoPagesOnOne;
                bBkFldPrnt = section.PageSetup.BookFoldPrinting;
                bBkFldPrnShts = section.PageSetup.BookFoldPrintingSheets;
                bBkFldRevPrnt = section.PageSetup.BookFoldRevPrinting;

                var index = section.Index;


                target.Sections[index].PageSetup.PaperSize = lPaperSize;
                target.Sections[index].PageSetup.GutterStyle = lGutterStyle;
                target.Sections[index].PageSetup.Orientation = lOrientation;
                target.Sections[index].PageSetup.MirrorMargins = lMirrorMargins;
                target.Sections[index].PageSetup.SectionStart = lScnStart;
                target.Sections[index].PageSetup.SectionDirection = lScnDir;
                target.Sections[index].PageSetup.OddAndEvenPagesHeaderFooter = lOddEvenHdFt;
                target.Sections[index].PageSetup.DifferentFirstPageHeaderFooter = lDiffFirstHdFt;
                target.Sections[index].PageSetup.VerticalAlignment = lVerticalAlignment;
                target.Sections[index].PageSetup.PageHeight = sPageHght;
                target.Sections[index].PageSetup.PageWidth = sPageWdth;
                target.Sections[index].PageSetup.TopMargin = sTMargin;
                target.Sections[index].PageSetup.BottomMargin = sBMargin;
                target.Sections[index].PageSetup.LeftMargin = sLMargin;
                target.Sections[index].PageSetup.RightMargin = sRMargin;
                target.Sections[index].PageSetup.Gutter = sGutter;
                target.Sections[index].PageSetup.GutterPos = sGutterPos;
                target.Sections[index].PageSetup.HeaderDistance = sHeaderDist;
                target.Sections[index].PageSetup.FooterDistance = sFooterDist;
                target.Sections[index].PageSetup.TwoPagesOnOne = bTwoPagesOnOne;
                target.Sections[index].PageSetup.BookFoldPrinting = bBkFldPrnt;
                target.Sections[index].PageSetup.BookFoldPrintingSheets = bBkFldPrnShts;
                target.Sections[index].PageSetup.BookFoldRevPrinting = bBkFldRevPrnt;
            }
        }

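Update 2

Actually, the page break not matching the paragraph formatting is not a copy-fidelity problem but a .doc to .docx conversion problem (https://support.microsoft.com/en-us/help/923183/the-layout-of-a-document-that-contains-a-page-break-may-be-different-i). Maybe someone has come up with a way to overcome this.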
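
One way to check whether the two documents really differ in this respect is to compare their compatibility settings before copying. The following is only a diagnostic sketch, not a fix. The class and method names are made up for illustration, and treating `wdSplitPgBreakAndParaMark` as the option behind the page-break difference is an assumption on my part. The indexed-property syntax for `Compatibility` assumes C# 4 or later with the Word interop assembly referenced.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper: names are illustrative, not part of the original code.
static class CompatibilityReport
{
    // Prints the compatibility mode of both documents so that a layout
    // difference can be traced to a mode mismatch rather than to the copy.
    public static void Print(Word.Document source, Word.Document target)
    {
        // CompatibilityMode is an int that maps onto WdCompatibilityMode
        // (available in Word 2010 and later).
        var sourceMode = (Word.WdCompatibilityMode)source.CompatibilityMode;
        var targetMode = (Word.WdCompatibilityMode)target.CompatibilityMode;
        Console.WriteLine("Source mode: " + sourceMode);
        Console.WriteLine("Target mode: " + targetMode);

        // Assumption: this option controls whether a page break and the
        // paragraph mark that follows it are laid out separately.
        var option = Word.WdCompatibility.wdSplitPgBreakAndParaMark;
        Console.WriteLine("Source split option: " + source.Compatibility[option]);
        Console.WriteLine("Target split option: " + target.Compatibility[option]);
    }
}
```

If the modes or the option differ between the two files, the remaining layout difference is more likely a file-format issue than something the copy routine can correct.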

The following code by Paul Edstein (macropod) may help you. At the very least, it will give you an idea of the complexity you are facing.

' ============================================================================================================
' KEEP NEXT THREE TOGETHER 
' ============================================================================================================
'
Sub CombineDocuments()
' Paul Edstein
' https://www.msofficeforums.com/word-vba/43339-combine-multiple-word-documents.html
'
' Users occasionally need to combine multiple documents that may or may not have the same page layouts,
'   Style definitions, and so on. Consequently, combining multiple documents is often rather more complex than
'   simply copying & pasting content from one document to another. Problems arise when the documents have
'   different page layouts, headers, footers, page numbering, bookmarks & cross-references,
'   Tables of Contents, Indexes, etc., etc., and especially when those documents have used the same Style
'   names with different definitions.
'
' The following Word macro (for Windows PCs only) handles the more common issues that arise when combining
'   documents; it does not attempt to resolve conflicts with paragraph auto-numbering,
'   document -vs- section page numbering in 'page x of y' numbering schemes, Tables of Contents or Indexing issues.
'   Neither does it attempt to deal with the effects on footnote or endnote numbering & positioning or with the
'   consequences of duplicated bookmarks (only one of which can exist in the merged document) and any corresponding
'   cross-references.
'
' The macro includes a folder browser. Simply select the folder to process and all documents in that folder
'   will be combined into the currently-active document. Word's .doc, .docx, and .docm formats will all be processed,
'   even if different formats exist in the selected folder.
'
    Application.ScreenUpdating = False
    Dim strFolder As String, strFile As String, strTgt As String
    Dim wdDocTgt As Document, wdDocSrc As Document, HdFt As HeaderFooter
    strFolder = GetFolder: If strFolder = "" Then Exit Sub
    Set wdDocTgt = ActiveDocument: strTgt = ActiveDocument.fullname
    strFile = Dir(strFolder & "\*.doc", vbNormal)
    While strFile <> ""
      If strFolder & "\" & strFile <> strTgt Then
        Set wdDocSrc = Documents.Open(FileName:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
        With wdDocTgt
          .Characters.Last.InsertBefore vbCr
          .Characters.Last.InsertBreak (wdSectionBreakNextPage)
          With .Sections.Last
            For Each HdFt In .Headers
              With HdFt
                .LinkToPrevious = False
                .range.Text = vbNullString
                .PageNumbers.RestartNumberingAtSection = True
                .PageNumbers.StartingNumber = wdDocSrc.Sections.First.Headers(HdFt.Index).PageNumbers.StartingNumber
              End With
            Next
            For Each HdFt In .Footers
              With HdFt
                .LinkToPrevious = False
                .range.Text = vbNullString
                .PageNumbers.RestartNumberingAtSection = True
                .PageNumbers.StartingNumber = wdDocSrc.Sections.First.Headers(HdFt.Index).PageNumbers.StartingNumber
              End With
            Next
          End With
          Call LayoutTransfer(wdDocTgt, wdDocSrc)
          .range.Characters.Last.FormattedText = wdDocSrc.range.FormattedText
          With .Sections.Last
            For Each HdFt In .Headers
              With HdFt
                .range.FormattedText = wdDocSrc.Sections.Last.Headers(.Index).range.FormattedText
                .range.Characters.Last.Delete
              End With
            Next
            For Each HdFt In .Footers
              With HdFt
                .range.FormattedText = wdDocSrc.Sections.Last.Footers(.Index).range.FormattedText
                .range.Characters.Last.Delete
              End With
            Next
          End With
        End With
        wdDocSrc.Close SaveChanges:=False
      End If
      strFile = Dir()
    Wend
    With wdDocTgt
      ' Save & close the combined document
      .SaveAs FileName:=strFolder & "\Forms.docx", FileFormat:=wdFormatXMLDocument, AddToRecentFiles:=False
      ' and/or:
      .SaveAs FileName:=strFolder & "\Forms.pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
      .Close SaveChanges:=False
    End With
    Set wdDocSrc = Nothing: Set wdDocTgt = Nothing
    Application.ScreenUpdating = True
End Sub
' ============================================================================================================
Private Function GetFolder() As String
' used by CombineDocument macro by Paul Edstein, keep together in same module
' https://www.msofficeforums.com/word-vba/43339-combine-multiple-word-documents.html

    Dim oFolder As Object
    GetFolder = ""
    Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
    If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
    Set oFolder = Nothing
End Function

Sub LayoutTransfer(wdDocTgt As Document, wdDocSrc As Document)
' works with previous Combine Documents macro from Paul Edstein, keep together
' https://www.msofficeforums.com/word-vba/43339-combine-multiple-word-documents.html
'
    Dim sPageHght As Single, sPageWdth As Single
    Dim sHeaderDist As Single, sFooterDist As Single
    Dim sTMargin As Single, sBMargin As Single
    Dim sLMargin As Single, sRMargin As Single
    Dim sGutter As Single, sGutterPos As Single
    Dim lPaperSize As Long, lGutterStyle As Long
    Dim lMirrorMargins As Long, lVerticalAlignment As Long
    Dim lScnStart As Long, lScnDir As Long
    Dim lOddEvenHdFt As Long, lDiffFirstHdFt As Long
    Dim bTwoPagesOnOne As Boolean, bBkFldPrnt As Boolean
    Dim bBkFldPrnShts As Boolean, bBkFldRevPrnt As Boolean
    Dim lOrientation As Long
    With wdDocSrc.Sections.Last.PageSetup
      lPaperSize = .PaperSize
      lGutterStyle = .GutterStyle
      lOrientation = .Orientation
      lMirrorMargins = .MirrorMargins
      lScnStart = .SectionStart
      lScnDir = .SectionDirection
      lOddEvenHdFt = .OddAndEvenPagesHeaderFooter
      lDiffFirstHdFt = .DifferentFirstPageHeaderFooter
      lVerticalAlignment = .VerticalAlignment
      sPageHght = .PageHeight
      sPageWdth = .PageWidth
      sTMargin = .TopMargin
      sBMargin = .BottomMargin
      sLMargin = .LeftMargin
      sRMargin = .RightMargin
      sGutter = .Gutter
      sGutterPos = .GutterPos
      sHeaderDist = .HeaderDistance
      sFooterDist = .FooterDistance
      bTwoPagesOnOne = .TwoPagesOnOne
      bBkFldPrnt = .BookFoldPrinting
      bBkFldPrnShts = .BookFoldPrintingSheets
      bBkFldRevPrnt = .BookFoldRevPrinting
    End With
    With wdDocTgt.Sections.Last.PageSetup
      .GutterStyle = lGutterStyle
      .MirrorMargins = lMirrorMargins
      .SectionStart = lScnStart
      .SectionDirection = lScnDir
      .OddAndEvenPagesHeaderFooter = lOddEvenHdFt
      .DifferentFirstPageHeaderFooter = lDiffFirstHdFt
      .VerticalAlignment = lVerticalAlignment
      .PageHeight = sPageHght
      .PageWidth = sPageWdth
      .TopMargin = sTMargin
      .BottomMargin = sBMargin
      .LeftMargin = sLMargin
      .RightMargin = sRMargin
      .Gutter = sGutter
      .GutterPos = sGutterPos
      .HeaderDistance = sHeaderDist
      .FooterDistance = sFooterDist
      .TwoPagesOnOne = bTwoPagesOnOne
      .BookFoldPrinting = bBkFldPrnt
      .BookFoldPrintingSheets = bBkFldPrnShts
      .BookFoldRevPrinting = bBkFldRevPrnt
      .PaperSize = lPaperSize
      .Orientation = lOrientation
    End With
End Sub
 
' ============================================================================================================

I used a template, edited it several times, and copied it into a new Word document. It works like this:

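// Collapse the target range to the end of the document, then assign the formatted source content
Word.Range rng = wordDocTarget.Content;
rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd);
rng.FormattedText = wordDocSource.Content.FormattedText;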
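
For context, here is a minimal end-to-end sketch of the same idea, including opening and closing the documents. The class name, method name, and parameters are hypothetical. It assumes C# 4 or later, so that optional and named arguments can be used on the COM calls, and it saves the target in place, which may not be what you want for the server-generated file.

```csharp
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical wrapper around the FormattedText approach shown above.
static class FormattedTextCopy
{
    // Appends the formatted content of sourcePath to the end of targetPath.
    public static void Append(string sourcePath, string targetPath)
    {
        var app = new Word.Application { Visible = false };
        Word.Document source = null;
        Word.Document target = null;
        try
        {
            source = app.Documents.Open(sourcePath, ReadOnly: true, AddToRecentFiles: false);
            target = app.Documents.Open(targetPath, AddToRecentFiles: false);

            // Collapse to the end of the target so nothing is overwritten.
            Word.Range destination = target.Content;
            destination.Collapse(Word.WdCollapseDirection.wdCollapseEnd);
            destination.FormattedText = source.Content.FormattedText;

            target.Save();
        }
        finally
        {
            // Close without prompting, then shut down the Word instance.
            if (source != null) source.Close(SaveChanges: false);
            if (target != null) target.Close(SaveChanges: false);
            app.Quit(SaveChanges: false);
            Marshal.ReleaseComObject(app);
        }
    }
}
```

Note that FormattedText carries character and paragraph formatting but not the page setup of the final section, which is why the question pairs it with a separate LayoutTransfer step.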

Another approach is to insert the whole file into a range/document:

// Collapse to the end of the document and insert the file there
rng = wordDoc.Range();
rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd);
rng.InsertFile(filepath);
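
A slightly more explicit variant, again only a sketch with hypothetical class and method names, passes the optional InsertFile arguments by name so that Word neither prompts for a conversion nor creates a link to the source file. It assumes C# 4 or later for named arguments on COM calls.

```csharp
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper around Range.InsertFile.
static class InsertFileExample
{
    // Inserts the file at filePath at the end of the target document.
    public static void AppendFile(Word.Document target, string filePath)
    {
        Word.Range rng = target.Content;
        rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd);

        // ConfirmConversions: false suppresses the conversion dialog,
        // Link: false inserts the content itself rather than an INCLUDETEXT field,
        // Attachment: false applies to e-mail documents and is not needed here.
        rng.InsertFile(filePath, ConfirmConversions: false, Link: false, Attachment: false);
    }
}
```

Unlike the FormattedText approach, this does not require the source document to be open in the same Word instance.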