How do I extract unique values from multiple columns and use them to populate one column?
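I have a large table with lots of data, but what I'm looking at is six columns of that table: the names of the people who worked together on a particular job. Like this: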
+-------+--------+--------+-------+--------+-------+
| Name1 | Name2  | Name3  | Name4 | Name5  | Name6 |
+-------+--------+--------+-------+--------+-------+
| Rod   | Jane   |        |       |        |       |
| Jane  | Freddy | Peter  | Paul  |        |       |
| Paul  |        |        |       |        |       |
| Mary  | Jane   | Rod    | Peter | Freddy | Paul  |
| Paul  | Rod    | Freddy |       |        |       |
+-------+--------+--------+-------+--------+-------+
What I want to end up with is this (on another sheet):
+--------+
| Name   |
+--------+
| Rod    |
| Jane   |
| Freddy |
| Peter  |
| Paul   |
| Mary   |
+--------+
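I want to be able to identify all the unique entries across those six columns and then populate them onto a different sheet. My first thought was to do it with a formula, which worked (I used INDEX MATCH with COUNTIF in the MATCH part), but the table has 11,000 records and potentially 1,200 different names, and it took most of a day to process. I thought, or hoped, that using VBA would make it run faster.
I looked at a number of possible answers. First, I went here: Populate unique values into a VBA array from Excel, and looked at brettdj's answer (because I sort of understood where it was going), ending up with the following code: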
Dim X
Dim objDict As Object
Dim lngRow As Long
Sheets("Data").Select
Set objDict = CreateObject("Scripting.Dictionary")
X = Application.Transpose(Range([be2], Cells(Rows.Count, "BE").End(xlUp)))
For lngRow = 1 To UBound(X, 1)
    objDict(X(lngRow)) = 1
Next
Sheets("Crew").Select
Range("A2:A" & objDict.Count) = Application.Transpose(objDict.keys)
End Sub
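That works for one column (BE is the Name1 column in the table above; Data is the sheet where the data is stored, and Crew is the sheet where I want the unique values). But I can't for the life of me figure out how to get it to take values from multiple columns (BE to BJ).
Then I tried this, derived from Jeremy Thompson's answer to another question: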
Sheets("Data").Select
Range("BE:BJ").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Sheets("Crew").Range("A:A"), Unique:=True
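But again, I couldn't get the information from the multiple columns combined into one. For a third attempt, I looked at Gary's Student's answer to How to extract unique values from two columns Excel VBA and tried this: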
Dim Na As Long, Nc As Long, Ne As Long
Dim i As Long
Na = Sheets("Stroke Data").Cells(Rows.Count, "BE").End(xlUp).Row
Nc = Sheets("Stroke Data").Cells(Rows.Count, "BF").End(xlUp).Row
Ne = 1
For i = 1 To Na
    Cells(Ne, "E").Value = Cells(i, "A").Value
    Ne = Ne + 1
Next i
For i = 1 To Na
    Cells(Ne, "E").Value = Cells(i, "C").Value
    Ne = Ne + 1
Next i
Sheets("Fail").Range("A:A").RemoveDuplicates Columns:=1, Header:=xlNo
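(That one only tried two columns, to see whether I could work it out that way, but no.)
I'm really at a loss. As you can probably see from the above, I'm flailing around wildly, trying to come at this from three different angles and getting nowhere. I feel there must be a way to make the first one work, if nothing else because it nearly worked. But I just don't get it.
I suppose I could run it for the four separate columns and then have a process that merges the four into one. But even then, I'm not sure how I'd remove the duplicates (as you can see in the table above, a name can appear in any of the columns).
As long as I end up with a single column containing the list of unique names, and it doesn't take hours to process, I don't really mind how I get there.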
This is a bit verbose, but it works with your example data. (You may need to adjust how the initial rng is set.)
Sub unique_names()
    Dim rng As Range
    Set rng = ActiveSheet.UsedRange ' adjust to the range that holds the names
    Dim col As Range, cel As Range
    Dim names() As Variant
    ReDim names(rng.Cells.Count)
    Dim i As Long
    i = 0
    ' First, let's add all the names to the array
    For Each col In rng.Columns
        For Each cel In col.Cells
            If cel.Value <> "" Then
                names(i) = cel.Value
                i = i + 1
            End If
        Next cel
    Next col
    ' Now, extract unique names from the array
    Dim arr As Collection
    Set arr = unique_values(names)
    For i = 1 To arr.Count
        Worksheets("Sheet1").Cells(i, 10) = arr(i)
    Next
End Sub
Private Function unique_values(iArr As Variant) As Collection
    ' A Collection rejects duplicate keys, so a failed Add means "already seen".
    ' Keys must be strings, hence CStr; unused (Empty) array slots are skipped.
    Dim arr As New Collection, a
    On Error Resume Next
    For Each a In iArr
        If Not IsEmpty(a) Then arr.Add a, CStr(a)
    Next
    On Error GoTo 0
    Set unique_values = arr
End Function
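One behavioural detail of the Collection trick above may matter for a list of names: Collection keys are compared without regard to case, whereas a Scripting.Dictionary (the other common way to de-duplicate in VBA) uses a case-sensitive binary comparison unless its CompareMode is changed. The following is only a small sketch to make that visible; the function names and the sample values are made up for illustration and are not part of the answer above.

```vba
Option Explicit

' Hypothetical helper: counts distinct strings using Collection keys.
Function DistinctViaCollection(items As Variant) As Long
    Dim col As New Collection, v As Variant
    On Error Resume Next              ' adding a duplicate key raises an error; skip it
    For Each v In items
        col.Add v, CStr(v)
    Next v
    On Error GoTo 0
    DistinctViaCollection = col.Count
End Function

' Hypothetical helper: counts distinct strings using a Scripting.Dictionary
' with its default (binary) comparison.
Function DistinctViaDictionary(items As Variant) As Long
    Dim dict As Object, v As Variant
    Set dict = CreateObject("Scripting.Dictionary")
    For Each v In items
        dict(CStr(v)) = Empty         ' adds the key if it is new
    Next v
    DistinctViaDictionary = dict.Count
End Function

Sub CompareCaseHandling()
    Dim sample As Variant
    sample = Array("Rod", "rod", "Jane")        ' made-up sample data
    Debug.Print DistinctViaCollection(sample)   ' "Rod" and "rod" share one key
    Debug.Print DistinctViaDictionary(sample)   ' "Rod" and "rod" are separate keys
End Sub
```

If the names are typed inconsistently, the Collection behaviour is probably what you want; with a dictionary, setting CompareMode to vbTextCompare before adding any keys should give the same effect.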
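Here is an approach that uses a dictionary. Just specify the range to search, and the RangeToDictionary function does the rest. I'm assuming you don't want to include blanks, so I removed those.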
Private Function RangeToDictionary(MyRange As Range) As Object
    If MyRange Is Nothing Then Exit Function
    If MyRange.Cells.Count < 1 Then Exit Function
    Dim cell As Range
    Dim key As String
    Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")
    For Each cell In MyRange
        ' Trim once and use the trimmed text for both the check and the key,
        ' so "Rod " and "Rod" cannot collide in dict.Add
        key = Trim$(cell.Value2)
        If key <> vbNullString And Not dict.exists(key) Then dict.Add key, key
    Next
    Set RangeToDictionary = dict
End Function
Sub Example()
    Dim dict As Object
    Dim rng As Range: Set rng = ThisWorkbook.Sheets("Sheet1").Range("A1:F5")
    Dim outsheet As Worksheet: Set outsheet = ThisWorkbook.Sheets("Sheet2")
    Set dict = RangeToDictionary(rng)
    If dict.Count = 0 Then Exit Sub ' nothing to write
    outsheet.Range(outsheet.Cells(1, 1), outsheet.Cells(dict.Count, 1)) = Application.Transpose(dict.items())
End Sub
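For the layout described in the question (names in columns BE:BJ of the Data sheet, output wanted from A2 on the Crew sheet), the same dictionary idea could be sketched roughly as follows. This is an illustration under assumptions rather than a tested drop-in: the names UniqueColumn and UniqueNamesToCrew are invented here, row 1 is assumed to be a header row, and the last row is taken from UsedRange because any of the six columns might be the longest. Reading the block into a Variant array once, instead of touching each cell, is usually what keeps this kind of loop fast on 11,000 rows.

```vba
Option Explicit

' Hypothetical helper: returns the distinct non-blank values of a 2-D Variant
' array as an (n x 1) array; returns Empty if there are none.
Function UniqueColumn(data As Variant) As Variant
    Dim dict As Object, r As Long, c As Long, n As Long
    Dim key As String, k As Variant, out() As Variant

    Set dict = CreateObject("Scripting.Dictionary")
    For r = LBound(data, 1) To UBound(data, 1)
        For c = LBound(data, 2) To UBound(data, 2)
            If Not IsError(data(r, c)) Then          ' skip #N/A and similar
                key = Trim$(CStr(data(r, c)))
                If Len(key) > 0 Then dict(key) = Empty ' adds the key if it is new
            End If
        Next c
    Next r
    If dict.Count = 0 Then Exit Function

    ' Copy the keys into a 2-D array so they can be written in one assignment
    ReDim out(1 To dict.Count, 1 To 1)
    For Each k In dict.Keys
        n = n + 1
        out(n, 1) = k
    Next k
    UniqueColumn = out
End Function

' Sheet names and columns are taken from the question; everything else is assumed.
Sub UniqueNamesToCrew()
    Dim wsData As Worksheet, lastRow As Long, result As Variant
    Set wsData = ThisWorkbook.Worksheets("Data")
    lastRow = wsData.UsedRange.Row + wsData.UsedRange.Rows.Count - 1
    If lastRow < 2 Then Exit Sub                     ' header only
    result = UniqueColumn(wsData.Range("BE2:BJ" & lastRow).Value)
    If IsEmpty(result) Then Exit Sub
    ThisWorkbook.Worksheets("Crew").Range("A2").Resize(UBound(result, 1), 1).Value = result
End Sub
```

Building the output array by hand avoids Application.Transpose, and keeping the de-duplication in a function that only sees an array makes it easy to try out in the Immediate window without any worksheet set up.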
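This will prompt you to select a range (you can select a non-contiguous range by holding CTRL), then extract the unique values from the selected range and output the results on a new sheet: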
Sub tgr()
    Dim wb As Workbook
    Dim wsDest As Worksheet
    Dim rData As Range
    Dim rArea As Range
    Dim aData As Variant
    Dim i As Long, j As Long
    Dim hUnq As Object
    'Prompt to select range. Uniques will be extracted from the range selected.
    'Can select a non-contiguous range by holding CTRL
    On Error Resume Next
    Set rData = Application.InputBox("Select range of names where unique names will be extracted:", "Data Selection", Selection.Address, Type:=8)
    On Error GoTo 0
    If rData Is Nothing Then Exit Sub 'Pressed cancel
    Set hUnq = CreateObject("Scripting.Dictionary")
    For Each rArea In rData.Areas
        If rArea.Cells.Count = 1 Then
            ReDim aData(1 To 1, 1 To 1)
            aData(1, 1) = rArea.Value
        Else
            aData = rArea.Value
        End If
        For i = 1 To UBound(aData, 1)
            For j = 1 To UBound(aData, 2)
                If Not hUnq.Exists(aData(i, j)) And Len(Trim(aData(i, j))) > 0 Then hUnq(Trim(aData(i, j))) = Trim(aData(i, j))
            Next j
        Next i
    Next rArea
    Set wb = rData.Parent.Parent 'First parent is the range's worksheet, second parent is the worksheet's workbook
    Set wsDest = wb.Sheets.Add(After:=wb.Sheets(wb.Sheets.Count))
    wsDest.Range("A1").Resize(hUnq.Count).Value = Application.Transpose(hUnq.Items)
End Sub
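Assuming you have Excel 2016 or later, you can do this with Power Query. Convert your data range to a table, select a cell in the table, choose "From Table" under Data > Get & Transform, then paste the following code into the Advanced Editor of the Power Query Editor (changing Table3 to whatever your table ends up being named).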
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name1", type text}, {"Name2", type text}, {"Name3", type text}, {"Name4", type text}, {"Name5", type text}, {"Name6", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type"," ","",Replacer.ReplaceText,{"Name1", "Name2", "Name3", "Name4", "Name5", "Name6"}),
    #"Added Custom" = Table.AddColumn(#"Replaced Value", "Text.Combine", each Text.Combine({[#"Name1"],[#"Name2"],[#"Name3"],[#"Name4"],[#"Name5"],[#"Name6"]},";")),
    #"Replaced Value1" = Table.ReplaceValue(#"Added Custom",";;","",Replacer.ReplaceText,{"Text.Combine"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value1", {{"Text.Combine", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Text.Combine"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Text.Combine", type text}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type1", {"Text.Combine"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each ([Text.Combine] <> "")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Name1", "Name2", "Name3", "Name4", "Name5", "Name6"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Text.Combine", "UniqueList"}})
in
    #"Renamed Columns"