如何将行中的重复数据转置为列

Question

我目前正在尝试使用 Excel VBA 清理大型数据集。数据集结构如下所示。

但是，我想让它看起来像这样，如果 A:D 列中的单元格都包含相同的值，则转置 E 列中的单元格。（并从 A:D)

这是我做的代码

Dim ws As Worksheet: Set ws = Sheets("test")
lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

Dim j As Integer
j = 6

For i = 2 To lastrow

    If (Range("A" & i).Value = Range("A" & i + 1).Value) And (Range("B" & i).Value = Range("B" & i + 1).Value) And (Range("C" & i).Value = Range("C" & i + 1).Value) Then
        Cells(i, j).Value = Cells(i + 1, 5).Value
        j = j + 1
    End If
    
    'Reset J back to 6 if columns A to D does not match previous
    If (Range("A" & i).Value <> Range("A" & i + 1).Value) Or (Range("B" & i).Value <> Range("B" & i + 1).Value) Or (Range("C" & i).Value <> Range("C" & i + 1).Value) Then
        j = 6
    End If
    
Next i

如何做到这一点？

Answer 1

这最终比我想象的要复杂，但似乎工作正常

Sub Compact()

    Const KEY_COLS As Long = 4
    Dim ws As Worksheet, i As Long, k As String, nextEmpty As Long
    Dim dict As Object
    Set dict = CreateObject("scripting.dictionary")
    
    Set ws = Sheets("test")
    For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
        'create a row "key" from first KEY_COLS cells
        k = Join(Application.Transpose(Application.Transpose( _
                  ws.Cells(i, 1).Resize(1, KEY_COLS))), "~~")
        
        If Not dict.exists(k) Then
            'move this row up?
            If nextEmpty > 0 Then
                ws.Cells(i, 1).Resize(1, KEY_COLS + 1).Cut ws.Cells(nextEmpty, 1)
                dict.Add k, nextEmpty 'new key - store row#
                nextEmpty = 0
            Else
                dict.Add k, i 'new key - store row#
            End If
        Else
            'seen this key before - move value to that row and clear
            ws.Cells(dict(k), Columns.Count).End(xlToLeft).Offset(0, 1).Value = _
                ws.Cells(i, KEY_COLS + 1).Value
            ws.Rows(i).ClearContents
            If nextEmpty = 0 Then nextEmpty = i 'available row
        End If
    Next i
End Sub

编辑：我认为这更简洁一些。它分为单独的“读”和“写”部分。

Sub Compact2()

    Const KEY_COLS As Long = 4
    Const SEP As String = "~~"
    Dim ws As Worksheet, i As Long, k, col As Long, v
    Dim dict As Object
    Set dict = CreateObject("scripting.dictionary")
    
    Set ws = Sheets("test")
    'collect all the unique combinations and associated values 
    For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
        'create a row "key" from first KEY_COLS cells
        k = Join(Application.Transpose(Application.Transpose( _
                  ws.Cells(i, 1).Resize(1, KEY_COLS))), SEP)
        
        If Not dict.exists(k) Then dict.Add k, New Collection
        dict(k).Add ws.Cells(i, KEY_COLS + 1).Value
        ws.Rows(i).ClearContents 'clear row
    Next i
    
    're-populate the sheet from the dictionary
    i = 1
    For Each k In dict
        ws.Cells(i, 1).Resize(1, KEY_COLS).Value = Split(k, SEP)
        col = KEY_COLS + 1
        For Each v In dict(k)
            ws.Cells(i, col) = v
            col = col + 1
        Next v
        i = i + 1
    Next k
End Sub

Answer 2

同意蒂姆·威廉姆斯的观点，这很棘手。我在该工作表中没有使用 VBA 的情况下接近解决方案（需要启用溢出范围）。我没有得到一个动态公式来计算数值，但你可以制作一个宏来拖动它或其他东西。

See this spreadsheet.

您需要在单元格 i1

中使用以下公式

=UNIQUE(FILTER(A:D,NOT(ISBLANK((A:A)))))

以下公式将在 M1 中，并向下拖动以匹配紧靠左侧的相应列。您可以设置一个宏，在更改事件中实际为您执行此操作。可能有一种方法可以使用数组公式使它动态化，但我无法及时 assemble 修改它。

=TRANSPOSE(FILTER(E:E,(NOT(ISBLANK(E:E))*(A:A&B:B&C:C&D:D=I1&J1&K1&L1))))

同样，如果您没有 excel 溢出范围功能，这将不起作用。要查看溢出范围，请通过网络浏览器检出 excel 文件，使其看起来如下图所示。灰色单元格包含相应的公式。

Answer 3

您可以使用 Power Query 轻松做到这一点

按前四列分组
将第 5 列聚合为分隔符（分号）分隔的文本字符串。
将分隔字符串拆分为新列

例如，我添加了一些四列不匹配的行

使用 Power Query

Select 数据中的某个单元格 Table
Data => Get&Transform => from Table/Range
当 PQ 编辑器打开时：Home => Advanced Editor
记下第 2 行中的 Table 名称
粘贴下面的 M 代码代替您看到的内容
将第 2 行中的 Table 名称更改回最初生成的名称。
阅读评论并探索 Applied Steps 以了解算法

M码

let
    Source = Excel.CurrentWorkbook(){[Name="Table17"]}[Content],

//set type for all columns as Text
    #"Changed Type" = Table.TransformColumnTypes(Source,List.Transform(Table.ColumnNames(Source), each {_, Text.Type})),

//group by first four columns, then aggregate the 5th column semicolon separated
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Column1", "Column2", "Column3", "Column4"}, {
        {"ColE", each Text.Combine([Column5],";"), Text.Type}
    }),

//split the aggregated text into new columns
//may need to edit this step depending on maximum number in the group
    #"Split Column by Delimiter" = Table.SplitColumn(#"Grouped Rows", "ColE", 
        Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"ColE.1", "ColE.2", "ColE.3"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{
        {"ColE.1", Int64.Type}, {"ColE.2", Int64.Type}, {"ColE.3", Int64.Type}})
in
    #"Changed Type1"

如何将行中的重复数据转置为列

How to transpose duplicated data in rows into columns

excel

vba

transpose