VBA 不带格式替换文本
VBA replacing text without formatting
在 docx 文件中,我有很多文本可能包含数学对象 (mml) 的特定符号。这些符号包含在特定的分隔符(MATHSTART 和 MATHEND)中,以便于查找。
示例:
MATHSTART<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" alttext="x equals StartFraction negative b plus-or-minus StartRoot b squared minus 4 a c EndRoot Over 2 a EndFraction">
<semantics>
<mrow>
<mi>x</mi>
<mo>=</mo>
<mrow class="MJX-TeXAtom-ORD">
<mfrac>
<mrow>
<mo>−<!-- − --></mo>
<mi>b</mi>
<mo>±<!-- ± --></mo>
<mrow class="MJX-TeXAtom-ORD">
<msqrt>
<msup>
<mi>b</mi>
<mrow class="MJX-TeXAtom-ORD">
<mn>2</mn>
</mrow>
</msup>
<mo>−<!-- − --></mo>
<mn>4</mn>
<mi>a</mi>
<mi>c</mi>
</msqrt>
</mrow>
</mrow>
<mrow>
<mn>2</mn>
<mi>a</mi>
</mrow>
</mfrac>
</mrow>
</mrow>
<annotation encoding="application/x-tex">x={-b\pm {\sqrt {b^{2}-4ac}} \over 2a}</annotation>
</semantics>
</math>MATHEND
我现在想使用 marcos 来获取所有这些部分并从文档中剪切它们,然后在没有分隔符(MATHSTART 和 MATHEND)并且没有格式(类似于 wdFormatPlainText
).期望的结果是 docx 中的数学方程式。
我目前拥有的:
Dim regex As Object, wholeDocText As String
Set regex = CreateObject("VBScript.RegExp")
Selection.WholeStory
Selection.Copy
wholeDocText = Selection.Text
With regex
.Pattern = "MATHSTART[.\s\S]*?MATHEND"
.Global = True
End With
Set matches = regex.Execute(wholeDocText)
For Each match In matches
s1 = Replace(match.Value, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
'select match.Value in the document
'overwrite the selected string with the new one
'sth like: Selection.Text s1(wdFormatPlainText)
Next match
问题是三行注释,不知道怎么实现
如果您只想删除每个 RegEx.Match 中的两个词,以下应该可行:
Sub TestMe()
Dim regex As Object, wholeDocText As String
Set regex = CreateObject("VBScript.RegExp")
Selection.WholeStory
Selection.Copy
wholeDocText = Selection.Text
With regex
.Pattern = "MATHSTART[.\s\S]*?MATHEND"
.Global = True
End With
Set matches = regex.Execute(wholeDocText)
For Each match In matches
s1 = Replace(match, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
match = s1
Next match
End Sub
我删除了第一个 Replace()
中的 .Value
并添加了 match = s1
.
注意:我利用了此
中的搜索和 select 功能
Sub convertMmlToWordField()
Dim StartWord As String, EndWord As String
Dim FindStartRange As Range, FindEndRange As Range
Dim CopyRange As Range, CopyStartRange As Range, CopyEndRange As Range
Set FindStartRange = ActiveDocument.Range
Set FindEndRange = ActiveDocument.Range
Set CopyRange = ActiveDocument.Range
StartWord = "MATHSTART"
EndWord = "MATHEND"
'Starting the Lookup for the starting word
With FindStartRange.find
.Text = StartWord
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
'Execute the Lookup
Do While .Execute
If .Found = True Then
Set CopyStartRange = FindStartRange
CopyStartRange.Select
'Setting the FindEndRange up for the remainder of the document beginning from the end of the StartWord
FindEndRange.Start = CopyStartRange.End
FindEndRange.End = ActiveDocument.Content.End
FindEndRange.Select
'Setting the Find to look for the End Word
With FindEndRange.find
.Text = EndWord
.Execute
If .Found = True Then
Set CopyEndRange = FindEndRange
CopyEndRange.Select
End If
End With
'Selecting the copy range
CopyRange.Start = CopyStartRange.Start
CopyRange.End = CopyEndRange.End
CopyRange.Select
s1 = Replace(CopyRange.Text, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
Selection.Text = s1
Selection.Copy
Selection.PasteAndFormat (wdFormatPlainText)
End If 'Ending the If Find1stRange .Found = True
Loop 'Ending the Do While .Execute Loop
End With 'Ending the Find1stRange.Find With Statement
End Sub
实际工作发生在这部分:
s1 = Replace(CopyRange.Text, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
Selection.Text = s1
Selection.Copy
Selection.PasteAndFormat (wdFormatPlainText)
CopyRange 是从 MATHSTART 到 MATHEND(含)的整个字符串。
所以你可以只切割那些部分。
剩下的是实际的 MML,您现在可以将其输入选择对象(替换文档中的字符串),然后不带格式地复制和粘贴它。
在 docx 文件中,我有很多文本可能包含数学对象 (mml) 的特定符号。这些符号包含在特定的分隔符(MATHSTART 和 MATHEND)中,以便于查找。
示例:
MATHSTART<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" alttext="x equals StartFraction negative b plus-or-minus StartRoot b squared minus 4 a c EndRoot Over 2 a EndFraction">
<semantics>
<mrow>
<mi>x</mi>
<mo>=</mo>
<mrow class="MJX-TeXAtom-ORD">
<mfrac>
<mrow>
<mo>−<!-- − --></mo>
<mi>b</mi>
<mo>±<!-- ± --></mo>
<mrow class="MJX-TeXAtom-ORD">
<msqrt>
<msup>
<mi>b</mi>
<mrow class="MJX-TeXAtom-ORD">
<mn>2</mn>
</mrow>
</msup>
<mo>−<!-- − --></mo>
<mn>4</mn>
<mi>a</mi>
<mi>c</mi>
</msqrt>
</mrow>
</mrow>
<mrow>
<mn>2</mn>
<mi>a</mi>
</mrow>
</mfrac>
</mrow>
</mrow>
<annotation encoding="application/x-tex">x={-b\pm {\sqrt {b^{2}-4ac}} \over 2a}</annotation>
</semantics>
</math>MATHEND
我现在想使用 marcos 来获取所有这些部分并从文档中剪切它们,然后在没有分隔符(MATHSTART 和 MATHEND)并且没有格式(类似于 wdFormatPlainText
).期望的结果是 docx 中的数学方程式。
我目前拥有的:
Dim regex As Object, wholeDocText As String
Set regex = CreateObject("VBScript.RegExp")
Selection.WholeStory
Selection.Copy
wholeDocText = Selection.Text
With regex
.Pattern = "MATHSTART[.\s\S]*?MATHEND"
.Global = True
End With
Set matches = regex.Execute(wholeDocText)
For Each match In matches
s1 = Replace(match.Value, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
'select match.Value in the document
'overwrite the selected string with the new one
'sth like: Selection.Text s1(wdFormatPlainText)
Next match
问题是三行注释,不知道怎么实现
如果您只想删除每个 RegEx.Match 中的两个词,以下应该可行:
Sub TestMe()
Dim regex As Object, wholeDocText As String
Set regex = CreateObject("VBScript.RegExp")
Selection.WholeStory
Selection.Copy
wholeDocText = Selection.Text
With regex
.Pattern = "MATHSTART[.\s\S]*?MATHEND"
.Global = True
End With
Set matches = regex.Execute(wholeDocText)
For Each match In matches
s1 = Replace(match, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
match = s1
Next match
End Sub
我删除了第一个 Replace()
中的 .Value
并添加了 match = s1
.
注意:我利用了此
Sub convertMmlToWordField()
Dim StartWord As String, EndWord As String
Dim FindStartRange As Range, FindEndRange As Range
Dim CopyRange As Range, CopyStartRange As Range, CopyEndRange As Range
Set FindStartRange = ActiveDocument.Range
Set FindEndRange = ActiveDocument.Range
Set CopyRange = ActiveDocument.Range
StartWord = "MATHSTART"
EndWord = "MATHEND"
'Starting the Lookup for the starting word
With FindStartRange.find
.Text = StartWord
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
'Execute the Lookup
Do While .Execute
If .Found = True Then
Set CopyStartRange = FindStartRange
CopyStartRange.Select
'Setting the FindEndRange up for the remainder of the document beginning from the end of the StartWord
FindEndRange.Start = CopyStartRange.End
FindEndRange.End = ActiveDocument.Content.End
FindEndRange.Select
'Setting the Find to look for the End Word
With FindEndRange.find
.Text = EndWord
.Execute
If .Found = True Then
Set CopyEndRange = FindEndRange
CopyEndRange.Select
End If
End With
'Selecting the copy range
CopyRange.Start = CopyStartRange.Start
CopyRange.End = CopyEndRange.End
CopyRange.Select
s1 = Replace(CopyRange.Text, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
Selection.Text = s1
Selection.Copy
Selection.PasteAndFormat (wdFormatPlainText)
End If 'Ending the If Find1stRange .Found = True
Loop 'Ending the Do While .Execute Loop
End With 'Ending the Find1stRange.Find With Statement
End Sub
实际工作发生在这部分:
s1 = Replace(CopyRange.Text, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
Selection.Text = s1
Selection.Copy
Selection.PasteAndFormat (wdFormatPlainText)
CopyRange 是从 MATHSTART 到 MATHEND(含)的整个字符串。 所以你可以只切割那些部分。 剩下的是实际的 MML,您现在可以将其输入选择对象(替换文档中的字符串),然后不带格式地复制和粘贴它。