Regex: capitalise the first letter of words longer than 3 chars, and after hyphens and apostrophes
Basically...
I'm trying to apply custom capitalisation to a string; I've spent a few hours battling with regex to no avail...
Requirements:
I need to capitalise:
- If first word >3 chars: First letter of the first word.
- If last word >3 chars: First letter of the last word.
- Always: First letter following a hyphen or apostrophe.
(The final regex needs to be implementable into VB6)
Examples:
anne-marie > Anne-Marie // 1st letter of first word + after hyphen
vom schattenreich > vom Schattenreich // 1st letter of last word
will it work-or-not > Will it Work-Or-Not // 1st letter of outer words + after hyphens
seth o'callaghan > Seth O'Callaghan // 1st letter of outer words + after apostrophe
first and last only > First and last Only // 1st letter of outer words (excl. middle)
sarah jane o'brien > Sarah jane O'Brien // 1st letter of outer words (excl. middle)
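What I've got so far:
I've cobbled together two regexes which, between them, almost do what I need. However, every attempt to merge them into one regex, or to write a single regex from scratch, has failed.
My main difficulty is that part of the capitalisation applies only to the first and last words, whereas the punctuation-specific capitalisation needs to apply to the whole string. I don't know enough about regex to tell whether this is even possible with one expression.
My regexes:
First letter of the first and last words, but it doesn't restrict itself to words of more than 3 chars, and it doesn't handle the punctuation capitalisation across the full string: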
^([a-zA-Z]).*\s([a-zA-Z])[a-zA-Z-]+$
First letter of all words, and after punctuation, where the word has more than 3 chars, but it doesn't exclude middle words, and it doesn't handle punctuation at the end:
(\b[a-zA-Z](?=[a-zA-Z-']{3}))
Question
How can I combine these two regexes to meet my requirements, or correct them enough that they can be used separately? Alternatively, provide a different regex that meets the requirements.
Reference / related source material:
Regex capitalize first letter every word, also after a special character like a dash
First word and first letter of last word of string with Regex
Here is my regex approach:
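Sub ReplaceAndTurnUppercase()
    ' Early binding: needs a project reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim reg As RegExp
    Dim Match As Object
    Dim s As String
    Dim res As String

    Set reg = New RegExp
    With reg
        .Pattern = "^[a-z](?=[a-zA-Z'-]{3})|\b[a-z](?=[a-zA-Z'-]{3,}$)|['-][a-z]"
        .Global = True      ' find every match, not just the first one
        .MultiLine = True   ' ^ and $ also match at line breaks (the demo input has several lines)
    End With

    s = "anne-marie" & vbCrLf & "vom schattenreich" & vbCrLf & "will it work-or-not" & vbCrLf & "seth o'callaghan" & vbCrLf & "first and last only" & vbCrLf & "sarah jane o'brien"
    res = s
    ' Matches are located in s; every replacement has the same length as the match,
    ' so the indexes remain valid for res
    For Each Match In reg.Execute(s)
        If Len(Match.Value) > 0 Then
            res = Left(res, Match.FirstIndex) & UCase(Match.Value) & Mid(res, Match.FirstIndex + Len(Match.Value) + 1)
        End If
    Next Match

    Debug.Print res ' Demo part
End Sub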
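The regex I am using is ^[a-z](?=[a-zA-Z'-]{3})|\b[a-z](?=[a-zA-Z'-]{3,}$)|['-][a-z]. Since the only characters it consumes are the letters we want to uppercase plus a hyphen or apostrophe, we can uppercase each whole match without bothering to capture any part of it.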
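To make the "uppercase the whole match" idea concrete, here is a minimal sketch in Python rather than VB6. It assumes Python's re module treats this particular pattern the same way the VBScript RegExp engine does; the function name capitalise is purely illustrative. Each example is processed as its own string, so line-ending differences between the two engines do not come into play.

```python
import re

# Same three alternatives as the pattern above; re.MULTILINE mirrors .MultiLine = True.
PATTERN = re.compile(
    r"^[a-z](?=[a-zA-Z'-]{3})|\b[a-z](?=[a-zA-Z'-]{3,}$)|['-][a-z]",
    re.MULTILINE,
)


def capitalise(text: str) -> str:
    """Uppercase every match as a whole (hypothetical helper name)."""
    # upper() leaves ' and - untouched, so no capture group is needed
    # to isolate the letter inside a match such as "-m" or "'c".
    return PATTERN.sub(lambda m: m.group(0).upper(), text)


# The six examples from the question.
examples = [
    "anne-marie",
    "vom schattenreich",
    "will it work-or-not",
    "seth o'callaghan",
    "first and last only",
    "sarah jane o'brien",
]

for source in examples:
    print(source, ">", capitalise(source))
```

Running it over the six examples from the question is a quick way to check the pattern before porting any change back to VB6.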
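The regex matches 3 alternatives:
^[a-z](?=[a-zA-Z'-]{3})
- The start of the string (in my case, of each line, since I set MultiLine = True), followed by a lowercase ASCII letter (consumed, and uppercased later) that is itself followed by 3 characters, each a letter, ' or - (not consumed, as they are inside the lookahead). The letter plus 3 more characters makes at least 4, i.e. a first word of more than 3 chars.
\b[a-z](?=[a-zA-Z'-]{3,}$)
- A word boundary \b, followed by a lowercase ASCII letter (consumed), followed by 3 or more letters, ' or - up to the end of the string (in my case, of the line).
['-][a-z]
- Matches ' or -, then a lowercase letter (anywhere in the string).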
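The three alternatives can also be inspected one at a time. The sketch below is again in Python, on the assumption that re behaves like the VBScript engine for these patterns; the labels and the helper name matches_by_alternative are made up for illustration.

```python
import re

# The three alternatives, split apart. The labels are illustrative only.
ALTERNATIVES = {
    "first word": r"^[a-z](?=[a-zA-Z'-]{3})",
    "last word": r"\b[a-z](?=[a-zA-Z'-]{3,}$)",
    "after ' or -": r"['-][a-z]",
}


def matches_by_alternative(text: str) -> dict:
    """Run each alternative on its own and list (index, matched text) pairs."""
    return {
        label: [(m.start(), m.group(0)) for m in re.finditer(pattern, text)]
        for label, pattern in ALTERNATIVES.items()
    }


for label, found in matches_by_alternative("will it work-or-not").items():
    print(label, found)
```

Run on its own, the second alternative also picks up the o in "-or", because a hyphen creates a word boundary. In the combined pattern the third alternative reaches that letter first, and since both would uppercase it the overlap is harmless.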
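The line
res = Left(res, Match.FirstIndex) & UCase(Match.Value) & Mid(res, Match.FirstIndex + Len(Match.Value) + 1)
does the actual work: it takes the part of the string up to the index of the match, appends the uppercased match, and then appends the rest of the string.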
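The index arithmetic in that line can be sketched with Python slices. The helper name splice_upper is hypothetical, and the mapping to Left, UCase and Mid in the comments is my reading of the VB6 line. FirstIndex is 0-based while Mid counts from 1, which is why the VB6 code adds 1.

```python
def splice_upper(text: str, first_index: int, length: int) -> str:
    """Uppercase text[first_index : first_index + length], keeping the rest as is."""
    head = text[:first_index]                                # Left(res, FirstIndex)
    middle = text[first_index:first_index + length].upper()  # UCase(Match.Value)
    tail = text[first_index + length:]                       # Mid(res, FirstIndex + Len + 1)
    return head + middle + tail


# "-m" starts at 0-based index 4 in "anne-marie" and is 2 characters long.
print(splice_upper("anne-marie", 4, 2))
```

For the ASCII input used here the result has the same length as the input, so match indexes computed once on the original string remain valid while the result string is patched match by match.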