如何将行中的重复数据转置为列
How to transpose duplicated data in rows into columns
我目前正在尝试使用 Excel VBA 清理大型数据集。数据集结构如下所示。
但是,我想让它看起来像这样,如果 A:D 列中的单元格都包含相同的值,则转置 E 列中的单元格。(并从 A:D)
这是我做的代码
Dim ws As Worksheet: Set ws = Sheets("test")
lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
Dim j As Integer
j = 6
For i = 2 To lastrow
If (Range("A" & i).Value = Range("A" & i + 1).Value) And (Range("B" & i).Value = Range("B" & i + 1).Value) And (Range("C" & i).Value = Range("C" & i + 1).Value) Then
Cells(i, j).Value = Cells(i + 1, 5).Value
j = j + 1
End If
'Reset J back to 6 if columns A to D does not match previous
If (Range("A" & i).Value <> Range("A" & i + 1).Value) Or (Range("B" & i).Value <> Range("B" & i + 1).Value) Or (Range("C" & i).Value <> Range("C" & i + 1).Value) Then
j = 6
End If
Next i
如何做到这一点?
这最终比我想象的要复杂,但似乎工作正常
Sub Compact()
Const KEY_COLS As Long = 4
Dim ws As Worksheet, i As Long, k As String, nextEmpty As Long
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), "~~")
If Not dict.exists(k) Then
'move this row up?
If nextEmpty > 0 Then
ws.Cells(i, 1).Resize(1, KEY_COLS + 1).Cut ws.Cells(nextEmpty, 1)
dict.Add k, nextEmpty 'new key - store row#
nextEmpty = 0
Else
dict.Add k, i 'new key - store row#
End If
Else
'seen this key before - move value to that row and clear
ws.Cells(dict(k), Columns.Count).End(xlToLeft).Offset(0, 1).Value = _
ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents
If nextEmpty = 0 Then nextEmpty = i 'available row
End If
Next i
End Sub
编辑:我认为这更简洁一些。它分为单独的“读”和“写”部分。
Sub Compact2()
Const KEY_COLS As Long = 4
Const SEP As String = "~~"
Dim ws As Worksheet, i As Long, k, col As Long, v
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
'collect all the unique combinations and associated values
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), SEP)
If Not dict.exists(k) Then dict.Add k, New Collection
dict(k).Add ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents 'clear row
Next i
're-populate the sheet from the dictionary
i = 1
For Each k In dict
ws.Cells(i, 1).Resize(1, KEY_COLS).Value = Split(k, SEP)
col = KEY_COLS + 1
For Each v In dict(k)
ws.Cells(i, col) = v
col = col + 1
Next v
i = i + 1
Next k
End Sub
同意蒂姆·威廉姆斯的观点,这很棘手。我在该工作表中没有使用 VBA 的情况下接近解决方案(需要启用溢出范围)。我没有得到一个动态公式来计算数值,但你可以制作一个宏来拖动它或其他东西。
您需要在单元格 i1
中使用以下公式
=UNIQUE(FILTER(A:D,NOT(ISBLANK((A:A)))))
以下公式将在 M1
中,并向下拖动以匹配紧靠左侧的相应列。您可以设置一个宏,在更改事件中实际为您执行此操作。可能有一种方法可以使用数组公式使它动态化,但我无法及时 assemble 修改它。
=TRANSPOSE(FILTER(E:E,(NOT(ISBLANK(E:E))*(A:A&B:B&C:C&D:D=I1&J1&K1&L1))))
同样,如果您没有 excel 溢出范围功能,这将不起作用。要查看溢出范围,请通过网络浏览器检出 excel 文件,使其看起来如下图所示。灰色单元格包含相应的公式。
您可以使用 Power Query 轻松做到这一点
- 按前四列分组
- 将第 5 列聚合为分隔符(分号)分隔的文本字符串。
- 将分隔字符串拆分为新列
例如,我添加了一些四列不匹配的行
使用 Power Query
- Select 数据中的某个单元格 Table
Data => Get&Transform => from Table/Range
- 当 PQ 编辑器打开时:
Home => Advanced Editor
- 记下第 2 行中的 Table 名称
- 粘贴下面的 M 代码代替您看到的内容
- 将第 2 行中的 Table 名称更改回最初生成的名称。
- 阅读评论并探索
Applied Steps
以了解算法
M码
let
Source = Excel.CurrentWorkbook(){[Name="Table17"]}[Content],
//set type for all columns as Text
#"Changed Type" = Table.TransformColumnTypes(Source,List.Transform(Table.ColumnNames(Source), each {_, Text.Type})),
//group by first four columns, then aggregate the 5th column semicolon separated
#"Grouped Rows" = Table.Group(#"Changed Type", {"Column1", "Column2", "Column3", "Column4"}, {
{"ColE", each Text.Combine([Column5],";"), Text.Type}
}),
//split the aggregated text into new columns
//may need to edit this step depending on maximum number in the group
#"Split Column by Delimiter" = Table.SplitColumn(#"Grouped Rows", "ColE",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"ColE.1", "ColE.2", "ColE.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{
{"ColE.1", Int64.Type}, {"ColE.2", Int64.Type}, {"ColE.3", Int64.Type}})
in
#"Changed Type1"
我目前正在尝试使用 Excel VBA 清理大型数据集。数据集结构如下所示。
但是,我想让它看起来像这样,如果 A:D 列中的单元格都包含相同的值,则转置 E 列中的单元格。(并从 A:D)
这是我做的代码
Dim ws As Worksheet: Set ws = Sheets("test")
lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
Dim j As Integer
j = 6
For i = 2 To lastrow
If (Range("A" & i).Value = Range("A" & i + 1).Value) And (Range("B" & i).Value = Range("B" & i + 1).Value) And (Range("C" & i).Value = Range("C" & i + 1).Value) Then
Cells(i, j).Value = Cells(i + 1, 5).Value
j = j + 1
End If
'Reset J back to 6 if columns A to D does not match previous
If (Range("A" & i).Value <> Range("A" & i + 1).Value) Or (Range("B" & i).Value <> Range("B" & i + 1).Value) Or (Range("C" & i).Value <> Range("C" & i + 1).Value) Then
j = 6
End If
Next i
如何做到这一点?
这最终比我想象的要复杂,但似乎工作正常
Sub Compact()
Const KEY_COLS As Long = 4
Dim ws As Worksheet, i As Long, k As String, nextEmpty As Long
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), "~~")
If Not dict.exists(k) Then
'move this row up?
If nextEmpty > 0 Then
ws.Cells(i, 1).Resize(1, KEY_COLS + 1).Cut ws.Cells(nextEmpty, 1)
dict.Add k, nextEmpty 'new key - store row#
nextEmpty = 0
Else
dict.Add k, i 'new key - store row#
End If
Else
'seen this key before - move value to that row and clear
ws.Cells(dict(k), Columns.Count).End(xlToLeft).Offset(0, 1).Value = _
ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents
If nextEmpty = 0 Then nextEmpty = i 'available row
End If
Next i
End Sub
编辑:我认为这更简洁一些。它分为单独的“读”和“写”部分。
Sub Compact2()
Const KEY_COLS As Long = 4
Const SEP As String = "~~"
Dim ws As Worksheet, i As Long, k, col As Long, v
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
'collect all the unique combinations and associated values
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), SEP)
If Not dict.exists(k) Then dict.Add k, New Collection
dict(k).Add ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents 'clear row
Next i
're-populate the sheet from the dictionary
i = 1
For Each k In dict
ws.Cells(i, 1).Resize(1, KEY_COLS).Value = Split(k, SEP)
col = KEY_COLS + 1
For Each v In dict(k)
ws.Cells(i, col) = v
col = col + 1
Next v
i = i + 1
Next k
End Sub
同意蒂姆·威廉姆斯的观点,这很棘手。我在该工作表中没有使用 VBA 的情况下接近解决方案(需要启用溢出范围)。我没有得到一个动态公式来计算数值,但你可以制作一个宏来拖动它或其他东西。
您需要在单元格 i1
=UNIQUE(FILTER(A:D,NOT(ISBLANK((A:A)))))
以下公式将在 M1
中,并向下拖动以匹配紧靠左侧的相应列。您可以设置一个宏,在更改事件中实际为您执行此操作。可能有一种方法可以使用数组公式使它动态化,但我无法及时 assemble 修改它。
=TRANSPOSE(FILTER(E:E,(NOT(ISBLANK(E:E))*(A:A&B:B&C:C&D:D=I1&J1&K1&L1))))
同样,如果您没有 excel 溢出范围功能,这将不起作用。要查看溢出范围,请通过网络浏览器检出 excel 文件,使其看起来如下图所示。灰色单元格包含相应的公式。
您可以使用 Power Query 轻松做到这一点
- 按前四列分组
- 将第 5 列聚合为分隔符(分号)分隔的文本字符串。
- 将分隔字符串拆分为新列
例如,我添加了一些四列不匹配的行
使用 Power Query
- Select 数据中的某个单元格 Table
Data => Get&Transform => from Table/Range
- 当 PQ 编辑器打开时:
Home => Advanced Editor
- 记下第 2 行中的 Table 名称
- 粘贴下面的 M 代码代替您看到的内容
- 将第 2 行中的 Table 名称更改回最初生成的名称。
- 阅读评论并探索
Applied Steps
以了解算法
M码
let
Source = Excel.CurrentWorkbook(){[Name="Table17"]}[Content],
//set type for all columns as Text
#"Changed Type" = Table.TransformColumnTypes(Source,List.Transform(Table.ColumnNames(Source), each {_, Text.Type})),
//group by first four columns, then aggregate the 5th column semicolon separated
#"Grouped Rows" = Table.Group(#"Changed Type", {"Column1", "Column2", "Column3", "Column4"}, {
{"ColE", each Text.Combine([Column5],";"), Text.Type}
}),
//split the aggregated text into new columns
//may need to edit this step depending on maximum number in the group
#"Split Column by Delimiter" = Table.SplitColumn(#"Grouped Rows", "ColE",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"ColE.1", "ColE.2", "ColE.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{
{"ColE.1", Int64.Type}, {"ColE.2", Int64.Type}, {"ColE.3", Int64.Type}})
in
#"Changed Type1"