Writing a query that can have Excel fill in and consolidate unpopulated fields within a group of duplicates?

Question

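I have a (large) dataset of contact information with many semi-duplicate rows that I'd like to condense into as few rows as possible. Attached is an example of what I'm talking about.

The blue table on the left is a smaller-scale example of what I'm currently working with. The orange table on the right is what I'd like the table to look like.

I'd like to write a query that can select IDs that have multiple rows and, within that selection, evaluate whether values can be moved into rows that have unpopulated cells (see ID "4", where I condensed three rows of data into one by filling in the blanks and consolidating duplicates).

An important point of emphasis is how this task is executed, as opposed to making a blanket statement about all duplicates across the worksheet. Ultimately I'd like to perform this task on the entire worksheet, but I'd like Excel to first isolate an individual ID and then perform the task above, rather than evaluating the criteria against all duplicate IDs at once. ((If that makes sense))

Another condition I'd like: for certain columns where multiple rows under the same ID have different values, allocate that data to subsequent columns (see the Tags and Tags 2 columns under ID "1") rather than overwriting the cell.

I'd only like to do this for 2 columns ^; for the others, leave them as separate rows.

This sounds like a task for Power Query, but my knowledge in that area is limited. Any help on how to build a query that accomplishes this is greatly appreciated. Thanks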

Answer 1

This seems to work OK, but it's fairly advanced because it requires custom code, so I'm not sure whether it will improve your understanding.

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Title", type text}, {"Company", type text}, {"Phone", type text}, {"Phone2", type any}, {"Street Address", type any}, {"City", type text}, {"Tags", type text}}),
    // group by ID, then unpivot each group and remove duplicate attribute/value pairs
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"Data", each Table.Distinct(Table.UnpivotOtherColumns(_, {"ID"}, "Attribute", "Value"), {"Attribute", "Value"}), type table}}),
    // combine all the values per attribute (including tags) into one cell for later splitting
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Group([Data], {"ID", "Attribute"}, {{"Data", each Text.Combine([Value],","), type text}})),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Attribute", "Data"}, {"Attribute", "Data.1"}),
    // replace null with Title to preserve rows with no data
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Custom",null,"Title",Replacer.ReplaceValue,{"Attribute"}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Value",{"Data"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Data.1"),
    // split the Tags column into any number of columns as needed
    // placeholder for null Tags, used only when counting delimiters below; the split itself reads from #"Pivoted Column"
    #"Replaced Value1" = Table.ReplaceValue(#"Pivoted Column",null,"xxx",Replacer.ReplaceValue,{"Tags"}),
    // build the names Tags.1 .. Tags.N, where N is the largest number of comma-separated tags in any row
    DynamicColumnList = List.Transform({1 ..List.Max(Table.AddColumn(#"Replaced Value1","Custom", each List.Count(Text.PositionOfAny([Tags],{","},Occurrence.All)))[Custom])+1}, each "Tags." & Text.From(_)),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Pivoted Column", "Tags", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), DynamicColumnList)
in
    #"Split Column by Delimiter"
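
To make the group / unpivot / de-duplicate idea easier to follow, here is a minimal sketch of the same sequence applied to one ID. The sample rows, the `Sample` table, and the step names are made up for illustration and are not taken from the asker's workbook.

```m
let
    // Made-up rows for a single ID (an assumption, not the original data)
    Sample = #table(
        type table [ID = Int64.Type, Company = nullable text, Phone = nullable text, Tags = nullable text],
        {
            {4, "Acme", null, "lead"},
            {4, null, "555-0100", "lead"},
            {4, "Acme", "555-0100", "vip"}
        }
    ),
    // Unpivoting turns each populated cell into an Attribute/Value row; null cells are dropped
    Unpivoted = Table.UnpivotOtherColumns(Sample, {"ID"}, "Attribute", "Value"),
    // Keep each Attribute/Value pair only once
    Deduplicated = Table.Distinct(Unpivoted, {"Attribute", "Value"}),
    // Join the values that remain for each attribute with commas
    Combined = Table.Group(
        Deduplicated,
        {"ID", "Attribute"},
        {{"Data", each Text.Combine([Value], ","), type text}}
    ),
    // Pivot the attributes back into columns
    Result = Table.Pivot(Combined, List.Distinct(Combined[Attribute]), "Attribute", "Data")
in
    Result
```

With this sample the three rows should collapse into a single row for ID 4, with Company "Acme", Phone "555-0100", and Tags "lead,vip". The combined Tags cell is what the answer's final step then splits into separate columns.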

Answer 2

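You can obtain your desired output from Power Query using just the Table.Group function.

I have assumed that:

The output columns are only the ones you show
The input columns have nothing in Phone2 and Tags2
- If that is not the case, a simple modification can be made
If a column contains more distinct entries than there are output columns for it, the extra entries are output concatenated in a single column.
- In other words, if you had three tags, the first would be in the Tags column and the second and third would be in the Tags 2 column, joined with commas.
- I did this because, since you showed no examples, I am not sure how you would want things arranged if you had multiple phones and multiple tags.

Note: If you want to restrict this to a single ID, just insert a filtering step at the beginning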
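
A filtering step of the kind mentioned in the note could look like the sketch below. The inline `Source` table and the `TargetID` value are made-up stand-ins so the query runs on its own; in the real query the `Table.SelectRows` step would sit just before the grouping step and read from the preceding step instead.

```m
let
    // Made-up stand-in for the workbook table (an assumption for illustration)
    Source = #table(
        {"ID", "Company", "Phone"},
        {
            {1, "Acme", "555-0100"},
            {4, "Globex", null},
            {4, null, "555-0199"}
        }
    ),
    // Hypothetical ID to isolate
    TargetID = 4,
    // Keep only the rows that belong to that ID
    #"Filtered Rows" = Table.SelectRows(Source, each [ID] = TargetID)
in
    #"Filtered Rows"
```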

M Code

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Title", type text}, {"Company", type text}, {"Phone", type text}, {"Phone2", type any}, {"Street Address", type text}, {"City", type text}, {"Tags", type text}, {"Tags2", type any}}),

//Group by ID, then, depending on how many columns are available in the results table,
//either concatenate the distinct values from multiple rows into one column
//or put them in separate columns
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {
        {"Title", each Text.Combine(List.Distinct([Title]),", ")},
        {"Company", each Text.Combine(List.Distinct([Company]),", ")},
        {"Phone", each try List.RemoveNulls([Phone]){0} otherwise null},
        {"Phone 2", each Text.Combine(List.RemoveFirstN(List.RemoveNulls(List.Distinct([Phone])),1),", ")},
        {"City", each Text.Combine(List.Distinct([City]),", ")},
        {"Tags", each try List.RemoveNulls([Tags]){0} otherwise null},
        {"Tags 2", each Text.Combine(List.RemoveFirstN(List.RemoveNulls(List.Distinct([Tags])),1),", ")}      
    })
in
    #"Grouped Rows"
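
The "first value in one column, the rest in the next" pattern used for Phone and Tags can be looked at in isolation. The sketch below uses a hypothetical helper, `SplitFirstAndRest`, that is not part of the answer. Unlike the answer's code, it de-duplicates before taking the first value and returns null explicitly when there is no second value.

```m
let
    // Hypothetical helper: first distinct non-null value, plus the remainder joined with ", "
    SplitFirstAndRest = (values as list) as record =>
        let
            // Drop nulls and repeated values, keeping the original order
            Cleaned = List.Distinct(List.RemoveNulls(values)),
            // null when the list has no usable values
            First = List.First(Cleaned, null),
            // Everything after the first value
            Rest = List.Skip(Cleaned, 1)
        in
            [
                First = First,
                Rest = if List.IsEmpty(Rest) then null else Text.Combine(Rest, ", ")
            ],
    // Made-up tag values for one ID
    Example = SplitFirstAndRest({"lead", null, "vip", "lead", "partner"})
in
    Example
```

For the made-up list above, the record should contain First = "lead" and Rest = "vip, partner", which matches the arrangement described in the assumptions: the first tag in Tags, the remaining ones joined in Tags 2.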

