Deleting duplicate text in a cell in Excel
I would like to know how to delete duplicated names/text within a cell. For example:
Jean Donea Jean Doneasee
R.L. Foye R.L. Foyesee
J.E. Zimmer J.E. Zimmersee
R.P. Reed R.P. Reedsee D.E. Munson D.E. Munsonsee
While searching on Google, I came across a macro like this:
Function RemoveDupes1(pWorkRng As Range) As String
    'Updateby20140924
    Dim xValue As String
    Dim xChar As String
    Dim xOutValue As String
    Set xDic = CreateObject("Scripting.Dictionary")
    xValue = pWorkRng.Value
    For i = 1 To VBA.Len(xValue)
        xChar = VBA.Mid(xValue, i, 1)
        If xDic.exists(xChar) Then
        Else
            xDic(xChar) = ""
            xOutValue = xOutValue & xChar
        End If
    Next
    RemoveDupes1 = xOutValue
End Function
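The macro runs, but it compares individual characters and removes every character that has already appeared earlier in the cell.
When I apply the code to these names, the result looks something like this: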
Jean Dos
R.L Foyes
J.E Zimers
R.P edsDEMuno
Looking at the result, I can tell it is not what I want, but I don't know how to correct the code.
The desired output should look like this:
Jean Donea
R.L. Foye
J.E. Zimmer
R.P. Reed
Any suggestions?
Thanks in advance.
This solution assumes that 'see' (or some other three-letter string) is always at the end of the cell value. If that is not the case, it will not work.
Function RemoveDupeInCell(dString As String) As String
    Dim x As Long, ct As Long
    Dim str As String
    'define str as half the length of the cell, minus the right three characters
    str = Trim(Left(dString, WorksheetFunction.RoundUp((Len(dString) - 3) / 2, 0)))
    'loop through the entire cell and count the number of instances of str
    For x = 1 To Len(dString)
        If Mid(dString, x, Len(str)) = str Then ct = ct + 1
    Next x
    'if it's more than one, set to str, otherwise error
    If ct > 1 Then
        RemoveDupeInCell = str
    Else
        RemoveDupeInCell = "#N/A"
    End If
End Function
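The three-character assumption is hard-coded as the `- 3` in the function above. As a rough sketch only, here is a variant that takes the suffix length as a parameter. The names `RemoveDupeWithSuffix` and `suffixLen` are made up for illustration, and the check is stricter than the original: it requires the text before the suffix to be exactly the name, a space, and the name again.

```vba
Option Explicit

' Hypothetical variant, not the original answer's code.
' Assumes the cell looks like: <name> & " " & <name> & <suffix of suffixLen chars>.
Function RemoveDupeWithSuffix(ByVal dString As String, _
                              Optional ByVal suffixLen As Long = 3) As String
    Dim body As String
    Dim candidate As String

    ' Nothing usable if the suffix length is invalid or swallows the whole text
    If suffixLen < 0 Or Len(dString) <= suffixLen Then
        RemoveDupeWithSuffix = "#N/A"
        Exit Function
    End If

    ' Drop the trailing suffix, e.g. "see"
    body = Left$(dString, Len(dString) - suffixLen)

    ' First half of the remainder (rounded up), without the separating space
    candidate = Trim$(Left$(body, (Len(body) + 1) \ 2))

    ' Accept only if the remainder is exactly the candidate repeated once
    If body = candidate & " " & candidate Then
        RemoveDupeWithSuffix = candidate
    Else
        RemoveDupeWithSuffix = "#N/A"
    End If
End Function
```

In a worksheet this would be called as =RemoveDupeWithSuffix(A1) or =RemoveDupeWithSuffix(A1, 3). A cell holding two different duplicated names, like the fourth example in the question, still returns "#N/A".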
Input
With the input shown in the image:
Result
The Debug.Print output:
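Regex
A regex can be built dynamically for each element of the cell and used as a find tool, so that only the shortest match is extracted. The pattern is \w*( OUTPUT_OF_EXTRACTELEMENT )\w*, for example \w*(Jean)\w*
The Regex reference must be enabled (the code below creates the object with CreateObject("VBScript.RegExp"), i.e. late binding, so it also runs without the reference).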
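To see what this pattern actually returns, here is a small sketch that runs it against one of the sample cells. `DescribeMatches` and `DemoPattern` are hypothetical helper names, not part of the answer, and the helper assumes the pattern contains one capture group.

```vba
Option Explicit

' Hypothetical demo helper: lists every match and its first capture group.
' Assumes patternText contains at least one capture group.
Function DescribeMatches(ByVal patternText As String, ByVal cellText As String) As String
    Dim re As Object, matches As Object, m As Object
    Dim out As String

    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference needed
    re.Pattern = patternText
    re.Global = True                           ' find all matches, not just the first

    Set matches = re.Execute(cellText)
    out = matches.Count & " match(es):"
    For Each m In matches
        out = out & " [" & m.Value & " -> " & m.SubMatches(0) & "]"
    Next m
    DescribeMatches = out
End Function

Sub DemoPattern()
    ' Prints: 2 match(es): [Donea -> Donea] [Doneasee -> Donea]
    Debug.Print DescribeMatches("\w*(Donea)\w*", "Jean Donea Jean Doneasee")
    ' Prints: 2 match(es): [R.L. -> R.L.] [RxLy -> RxLy]
    Debug.Print DescribeMatches("\w*(R.L.)\w*", "R.L. Foye RxLy Foyesee")
End Sub
```

The second call shows a caveat: the element is inserted into the pattern unescaped, so a `.` in a name such as R.L. matches any character.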
Code
Function EXTRACTELEMENT(Txt As String, n, Separator As String) As String
    ' Returns the n-th (1-based) element of Txt split on Separator
    On Error GoTo ErrHandler
    EXTRACTELEMENT = Split(Application.Trim(Mid(Txt, 1)), Separator)(n - 1)
    Exit Function
ErrHandler:
    ' error handling code: n is out of range, so return "0"
    EXTRACTELEMENT = 0
    On Error GoTo 0
End Function
Sub test()
    Dim str As String, F_str As String, strPattern As String
    Dim objMatches As Object, objRegExp As Object
    Dim lastrow As Long, Row As Long, k As Long, N_Elements As Long
    Set objRegExp = CreateObject("VBScript.RegExp") 'New regexp
    lastrow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row
    For Row = 1 To lastrow
        str = Range("A" & Row)
        F_str = ""
        ' UBound is zero-based, so this is the element count minus one
        N_Elements = UBound(Split(str, " "))
        If N_Elements > 0 Then
            For k = 1 To N_Elements + 1
                ' Build the pattern around the k-th element of the cell
                strPattern = "\w*(" & EXTRACTELEMENT(CStr(str), k, " ") & ")\w*"
                With objRegExp
                    .Pattern = strPattern
                    .Global = True
                End With
                If objRegExp.test(strPattern) Then
                    Set objMatches = objRegExp.Execute(str)
                    If objMatches.Count > 1 Then
                        ' Element occurs more than once: add it only if not added yet
                        If objRegExp.test(F_str) = False Then
                            F_str = F_str & " " & objMatches(0).Submatches(0)
                        End If
                    ElseIf k <= 2 And objMatches.Count = 1 Then
                        ' Keep the first two elements even if they occur only once
                        F_str = F_str & " " & objMatches(0).Submatches(0)
                    End If
                End If
            Next k
        Else
            ' Single-element cell: nothing to deduplicate
            F_str = str
        End If
        Debug.Print Trim(F_str)
    Next Row
End Sub
Note that you can replace the Debug.Print line to write to the target cell instead; for column B, use Cells(Row, 2) = Trim(F_str)
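Explanation
Function
You can use this UDF, which uses the Split function to get the elements separated by spaces (" "), so that each element can be compared against the cell.
Loop
It loops k from 1 to the number of elements in each cell, and Row from 1 to lastrow.
Regex
The regex finds the matches in the cell, and the new string is built from the shortest element of each match.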
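The same Split-and-compare idea can also be sketched as a worksheet function. This is not the code above: it swaps the regex for a plain prefix comparison, and `RemoveDupeWords` is a made-up name. It assumes a repeated element is either identical to an earlier one or is the earlier one with extra characters glued on (like "Doneasee" after "Donea").

```vba
Option Explicit

' Hypothetical helper (not from the original answer): keeps the first
' occurrence of each space-separated element and drops any later element
' that equals, or merely extends, an element already kept.
Function RemoveDupeWords(ByVal txt As String) As String
    Dim tokens() As String
    Dim kept As Object
    Dim seen As Variant
    Dim token As String, result As String
    Dim isDupe As Boolean
    Dim i As Long

    Set kept = CreateObject("Scripting.Dictionary")
    ' WorksheetFunction.Trim also collapses repeated inner spaces
    tokens = Split(Application.WorksheetFunction.Trim(txt), " ")

    For i = LBound(tokens) To UBound(tokens)
        token = tokens(i)
        isDupe = False
        For Each seen In kept.Keys
            ' Duplicate if an already kept element is a prefix of this one
            If Left$(token, Len(seen)) = seen Then
                isDupe = True
                Exit For
            End If
        Next seen
        If Not isDupe Then
            kept(token) = True
            result = result & " " & token
        End If
    Next i

    RemoveDupeWords = Trim$(result)
End Function
```

Used as =RemoveDupeWords(A1), the first three sample cells give the desired output, and the fourth gives "R.P. Reed D.E. Munson". The prefix rule is a trade-off: a genuinely different later name that starts with an earlier one (for example "Jo" followed by "John") would be dropped too.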