Delete text between parentheses if text between quotes has fewer than 3 words
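I have a document containing several paragraphs. I want to loop through each paragraph and check whether it contains words inside quotation marks. If the quoted text has fewer than 3 words, I want to delete all the text that appears inside parentheses in that paragraph.
Consider the following paragraph.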
The information that you need to include depends on what type of source the material comes from. For "printed material", you normally only need to include the author (s) (or article title if there is no author) and year of publication (never the month or day) in your reference. When citing a specific part of a source (for example, a direct quotation), you will also want to indicate the page number (s) or other designation (chapter, figure, table, equation, etc.). For Internet sources, paragraph numbers can be used when page numbers are not available.
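Because the phrase "printed material" contains only 2 words, I want to delete all the words in parentheses, along with the parentheses themselves.
How can I do this with VBA in Microsoft Word? I have posted my failed attempt to show that this is a genuine question.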
Sub RemoveUnnecesaryTexts()
    Dim doc As Document
    Dim para As Paragraph
    Set doc = ActiveDocument
    For Each para In doc.Paragraphs
        Application.ScreenUpdating = False
        Selection.HomeKey Unit:=wdStory
        With Selection.Find
            .ClearFormatting
            .Text = "(""<*>"")"
        End With
        If Selection.Find.Execute Then
            Selection.Parent.Select
            With Selection.Find
                .Text = "\((<*>)\)"
                .Replacement.Text = ""
                .Forward = True
                .Wrap = wdFindContinue
                .Format = False
                .MatchCase = False
                .MatchWholeWord = False
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = False
            End With
            Selection.Find.Execute Replace:=wdReplaceAll
        End If
    Next para
End Sub
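This code does not check the number of words inside the quotation marks, because I have not managed to do that yet. But it should at least give you an idea of what I am trying to do. Any ideas about what I am doing wrong here?
Pseudocode: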
Iterate through the paragraphs.
Regex Match the quoted sub string "...." and then count the spaces in the match
If spaces < 2 then
Second Regex match all occurrences of (....) and delete all matches in the paragraph
Else
Continue to next paragraph
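Note that this depends on there being only one quoted substring. If that is not the case, you will have to implement logic to select the correct quotation.
Edit: I am far from a regex expert, but the matching could be as simple as this:
String match1 = "\".*\""
String match2 = "\(.*\)"
These match the patterns you want greedily, which means they will match "1", "12345", (123456....1223447748557), as well as the empty string "" and empty parentheses (). If you don't want the empty cases, switch the "*" to "+". Because ".*" is greedy, a paragraph with more than one quoted string or parenthetical is matched from the first opening delimiter to the last closing one, so a bounded class such as [^"]* or [^)]* may be safer.
I have not tested this; I just hack at regexes until they do what I want. You will also need to handle (or ignore) parentheses that appear inside the quotes.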
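To make the greedy behaviour concrete, here is a minimal VBA sketch that compares ".*" with a bounded character class. It assumes a Windows machine where the late-bound "VBScript.RegExp" object is available; FirstMatch and DemoGreedyVersusBounded are hypothetical helper names, not part of any library.

```vba
' Returns the first match of pat in source, or "" when nothing matches.
' FirstMatch is a hypothetical helper written for this illustration.
Function FirstMatch(ByVal source As String, ByVal pat As String) As String
    Dim re As Object
    Dim found As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference needed
    re.Pattern = pat
    re.Global = False
    Set found = re.Execute(source)
    If found.Count > 0 Then FirstMatch = found(0).Value
End Function

Sub DemoGreedyVersusBounded()
    Dim sample As String
    sample = "For ""printed material"" and ""online sources"" (see notes)"
    ' Greedy: runs from the first quote to the last quote in the string
    Debug.Print FirstMatch(sample, """.*""")
    ' Bounded: stops at the next quote
    Debug.Print FirstMatch(sample, """[^""]*""")
End Sub
```

Running DemoGreedyVersusBounded from the Immediate window shows the difference: only the bounded pattern isolates the first quoted phrase on its own.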
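Also, in whatever language you choose to implement this in, you can walk through the matched quoted substring character by character and increment a counter whenever the character is a space or, better, check whether your string library has a function that does this for you.
Finally, most languages should have a String.replace() function, in which case I would loop over each parenthesis match and feed it to the function, like Paragraph.replace(matches[i], ""), which simply replaces your match with an empty string.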
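Edit:
Oh. I somehow missed the VBA part of the title. Then you need to work with Word's object model. As far as I know, there is a Document object that should return a Paragraphs collection you can iterate over. I know VBA has a regex class you can use, and the string methods should work fine. I am not sure whether VBA has an 'int HowManyTimesDoesThisCharAppearInThisString(String search, char target)', but it is not hard to implement yourself. You can check the String documentation on MSDN. That is the one thing I like most about working with M$ code: the chances that someone else has hit the same problem as you are above average, and the MSDN documentation is quite good.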
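As a rough illustration of the pieces just mentioned, the sketch below works on a paragraph's text as a plain string. It assumes straight (not curly) quotation marks, uses only the first quoted phrase, and relies on the late-bound "VBScript.RegExp" object; CountChar and StripParentheses are hypothetical names introduced here.

```vba
' Counts how often target occurs in source (hypothetical helper).
Function CountChar(ByVal source As String, ByVal target As String) As Long
    If Len(target) = 0 Then Exit Function
    CountChar = (Len(source) - Len(Replace(source, target, ""))) \ Len(target)
End Function

' Returns paraText with every "(...)" removed when the first quoted
' phrase has fewer than three words; otherwise returns it unchanged.
Function StripParentheses(ByVal paraText As String) As String
    Dim re As Object
    Dim quoted As Object
    Dim phrase As String

    StripParentheses = paraText
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True

    ' Assumption: straight double quotes delimit the phrase
    re.Pattern = """([^""]*)"""
    Set quoted = re.Execute(paraText)
    If quoted.Count = 0 Then Exit Function

    phrase = Trim$(quoted(0).SubMatches(0))
    ' Fewer than two spaces means fewer than three words
    If CountChar(phrase, " ") < 2 Then
        re.Pattern = "\([^)]*\)"
        StripParentheses = re.Replace(paraText, "")
    End If
End Function
```

Writing the result back with something like para.Range.Text = StripParentheses(para.Range.Text) would probably discard character formatting inside the paragraph, so this is better read as a demonstration of the logic than as a drop-in macro.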
Also, I found this, which may help you: Counting the Words in a String
That approach is actually simpler: it just splits the string on spaces and takes the length of the resulting array.
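A minimal VBA sketch of that split-and-count idea follows. CountWords is a hypothetical name, and skipping empty pieces is an addition of mine so that runs of several spaces are not counted as extra words.

```vba
' Counts words by splitting on spaces and ignoring empty pieces.
Function CountWords(ByVal source As String) As Long
    Dim pieces() As String
    Dim i As Long
    pieces = Split(Trim$(source), " ")
    ' For an empty string Split returns an empty array, so the loop is skipped
    For i = LBound(pieces) To UBound(pieces)
        If Len(pieces(i)) > 0 Then CountWords = CountWords + 1
    Next i
End Function
```

For example, CountWords("printed material") gives 2, which is below the threshold of 3 used in the question.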
Based on my earlier answer: Format number between markers as subscript
This works for me:
Sub DeleteParenthesesAfterShortQuote()
    Dim wd As Document
    Dim para As Paragraph
    Dim rOpeningQuote As Range
    Dim rClosingQuote As Range
    Dim rBetweenQuotes As Range
    Dim quoteFound As Boolean
    Dim nWordsBetweenQuotes As Long
    Dim rOpeningParenthesis As Range
    Dim rClosingParenthesis As Range
    Dim openingParenthesisFound As Boolean

    Set wd = ActiveDocument
    For Each para In wd.Paragraphs
        para.Range.Select
        'Look for opening quote
        quoteFound = Selection.Find.Execute("""")
        If quoteFound Then
            Set rOpeningQuote = Selection.Range
            'Look for closing quote
            Selection.Find.Execute """"
            Set rClosingQuote = Selection.Range
            'Count words between the two
            Set rBetweenQuotes = wd.Range(rOpeningQuote.End, rClosingQuote.Start)
            nWordsBetweenQuotes = UBound(Split(rBetweenQuotes.Text, " ")) + 1
            If nWordsBetweenQuotes < 3 Then
                para.Range.Select
                Do
                    'Look for opening parenthesis
                    openingParenthesisFound = Selection.Find.Execute("(")
                    If Not openingParenthesisFound Then Exit Do
                    Set rOpeningParenthesis = Selection.Range
                    'Look for closing parenthesis; stop if it is unmatched
                    wd.Range(Selection.End, para.Range.End).Select
                    If Not Selection.Find.Execute(")") Then Exit Do
                    Set rClosingParenthesis = Selection.Range
                    'Delete and select rest of paragraph for next iteration
                    wd.Range(rOpeningParenthesis.Start, rClosingParenthesis.End).Delete
                    wd.Range(Selection.End, para.Range.End).Select
                Loop
            End If
        Else
            'No quote found in this paragraph. Do nothing.
        End If
    Next para
End Sub
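Result:
Note that deleting the parenthesized bits leaves multiple consecutive spaces behind (the examples highlighted in pink in the image above). I'm not sure whether you want to fix that, but if you do, give it a try and ask a new question if you get stuck.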
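One possible way to tidy up those leftover spaces is a wildcard find-and-replace over a range, sketched below. CollapseSpaces is a hypothetical name, and the pattern " {2,}" assumes a comma as the list separator; with some regional settings Word may expect " {2;}" instead.

```vba
' Replaces every run of two or more spaces in target with a single space.
' CollapseSpaces is a hypothetical helper, not a built-in Word method.
Sub CollapseSpaces(ByVal target As Range)
    With target.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = " {2,}"            ' wildcard: two or more spaces
        .Replacement.Text = " "
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop         ' stay inside the given range
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

It could be called once after the main loop, for example as CollapseSpaces ActiveDocument.Content, or per paragraph with para.Range.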