How do I create a custom index column in PowerQuery?

I have the following data in PowerQuery:

| ParentX | A |
| ParentY | A |
| ParentZ | A |
| ParentY | B |
| ParentZ | B |
| ParentX | C |
| ParentY | C |
| ParentZ | C |

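I would like to add an index column that numbers the parents of each element (counting down, so the last parent listed for an element gets 1):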

| ParentX | A | 3 |
| ParentY | A | 2 |
| ParentZ | A | 1 |
| ParentY | B | 2 |
| ParentZ | B | 1 |
| ParentX | C | 3 |
| ParentY | C | 2 |
| ParentZ | C | 1 |

The end goal is to pivot the data on this new column, like this:

| Object | Root    | Parent 2 | Parent 3 |
| A      | ParentZ | ParentY  | ParentX  |
| B      | ParentZ | ParentY  |          |
| C      | ParentZ | ParentY  | ParentX  |

  1. Create an Excel Table with 2 columns (Parent, Child)
  2. Use this Table in Power Query
  3. Insert the function Combiner.CombineTextByDelimiter(";") (see line 3)
  4. Group by Child and use the function above (see line 4)
  5. Split the result (line 5)

Code:

let
    Quelle    = Excel.CurrentWorkbook(){[Name="Tabelle2"]}[Content],
    fcombine  = Combiner.CombineTextByDelimiter(";"), 
    #"Group1" = Table.Group(Quelle, {"Child"}, {{"Parents", each fcombine([Parent]), type text}}),
    #"Split1" = Table.SplitColumn(#"Group1", "Parents", Splitter.SplitTextByDelimiter(";"),{"Parents.1", "Parents.2", "Parents.3"}),
    #"Result" = Table.TransformColumnTypes(#"Split1", {{"Parents.1", type text}, {"Parents.2", type text}, {"Parents.3", type text}})
in
    #"Result"
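
For anyone who wants to try this without an Excel table, here is a minimal self-contained sketch of the same group-and-split idea. The inline `#table` data, the step names and the output column names are assumptions taken from the question. The `List.Reverse` call is an addition that is not in the query above: it puts the last parent of each group first so the result matches the root-first layout the question asks for. It relies on the rows inside each group keeping their source order.

```powerquery
let
    // Inline sample data standing in for the Excel table
    // (assumption: the columns are named Parent and Child)
    Source = #table(
        {"Parent", "Child"},
        {
            {"ParentX", "A"}, {"ParentY", "A"}, {"ParentZ", "A"},
            {"ParentY", "B"}, {"ParentZ", "B"},
            {"ParentX", "C"}, {"ParentY", "C"}, {"ParentZ", "C"}
        }
    ),
    // Function that joins a list of text values with ";"
    Combine = Combiner.CombineTextByDelimiter(";"),
    // One row per child; parents are reversed so the root comes first
    Grouped = Table.Group(
        Source,
        {"Child"},
        {{"Parents", each Combine(List.Reverse([Parent])), type text}}
    ),
    // Split the joined text back into columns; missing parents become null
    Split = Table.SplitColumn(
        Grouped,
        "Parents",
        Splitter.SplitTextByDelimiter(";"),
        {"Root", "Parent 2", "Parent 3"}
    )
in
    Split
```

With a fixed list of three output names, a child that has more than three parents would lose the extra values, so the list needs to be as long as the deepest hierarchy.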

Hi R.,

Here is the query I used to generate the index column from the question:

let
    // This has the original parent/child column
    Source = #"Parent Child Query",

    // Count the number of parents per child
    #"Grouped Rows" = Table.Group(Source, {"Attribute:id"}, {{"Count", each Table.RowCount(_), type number}}),

    // Add a new column of lists with the indexes per child
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "ParentIndex", each List.Numbers([Count], [Count], -1)),

    // Expand the lists in the previous step
    #"Expand ParentIndex" = Table.ExpandListColumn(#"Added Custom", "ParentIndex"),

    // Create the column name columns (Parent.1, Parent.2, etc)
    #"Added Custom1" = Table.AddColumn(#"Expand ParentIndex", "ParentColumn", each "Parent."&Text.From([ParentIndex])),

    // Adds an index column that you use when merging with the original table
    #"Added Index" = Table.AddIndexColumn(#"Added Custom1", "Index", 0, 1)
in
    #"Added Index"
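
To make the `List.Numbers` call in the query above easier to follow, here is a small sketch with hard-coded counts (the record field names `A` and `B` are only for illustration). The arguments are the start value, the number of items and the increment, so passing the count twice with an increment of -1 counts down from the count to 1.

```powerquery
let
    // A child with 3 parents gets the indexes 3, 2, 1
    IndexesForA = List.Numbers(3, 3, -1),
    // A child with 2 parents gets the indexes 2, 1
    IndexesForB = List.Numbers(2, 2, -1)
in
    [A = IndexesForA, B = IndexesForB]
```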

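Once that was done, I created another query to hold the merged result (it refers to the query above by the name `Epic Parent Indices`):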

let
    // This is the original parent/child column
    Source = #"Parent Child Query",

    // Add an index column that matches the index column in the previous query
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),

    // Merge the two queries based on the index columns
    Merge = Table.NestedJoin(#"Added Index",{"Index"},#"Epic Parent Indices",{"Index"},"NewColumn"),

    // Expand the new column
    #"Expand NewColumn" = Table.ExpandTableColumn(Merge, "NewColumn", {"ParentColumn"}, {"ParentColumn"}),

    // Remove the index column
    #"Removed Columns" = Table.RemoveColumns(#"Expand NewColumn",{"Index"}),

    // Sort the data by attribute and then by Parent column so the columns will be in the right order
    #"Sorted Rows" = Table.Sort(#"Removed Columns",{{"Attribute:id", Order.Descending}, {"ParentColumn", Order.Ascending}}),

    // Pivot!
    #"Pivoted Column" = Table.Pivot(#"Sorted Rows", List.Distinct(#"Sorted Rows"[ParentColumn]), "ParentColumn","Parent:id")
in
    #"Pivoted Column"
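
As an illustration of the final pivot step on its own, here is a small sketch with inline data that is already in the shape the sorted table should have. The sample rows are an assumption based on the data in the question.

```powerquery
let
    // One row per parent/child relationship, with the target column name
    Rows = #table(
        {"Attribute:id", "ParentColumn", "Parent:id"},
        {
            {"A", "Parent.1", "ParentZ"},
            {"A", "Parent.2", "ParentY"},
            {"A", "Parent.3", "ParentX"},
            {"B", "Parent.1", "ParentZ"},
            {"B", "Parent.2", "ParentY"}
        }
    ),
    // Each distinct ParentColumn value becomes a column filled from Parent:id
    Pivoted = Table.Pivot(
        Rows,
        List.Distinct(Rows[ParentColumn]),
        "ParentColumn",
        "Parent:id"
    )
in
    Pivoted
```

Child B has no `Parent.3` row, so that cell comes out as null after the pivot.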

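There are three key steps here:

  1. Use Table.Group to get the number of parents for each child element.
  2. Use List.Numbers to get the index values for each parent/child relationship.
  3. Use Table.AddIndexColumn to add index columns that serve as the key in the call to Table.NestedJoin. If you don't do this, you'll get duplicate data in the merge.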
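
The three steps can be sketched in a single query with inline sample data. This is only a sketch under assumptions: the `#table` data and the step names are made up for illustration, and the positional join only lines up if the source rows are already ordered by child, as they are in the question.

```powerquery
let
    // Inline sample data (assumption: rows are already ordered by child)
    Source = #table(
        {"Parent:id", "Attribute:id"},
        {
            {"ParentX", "A"}, {"ParentY", "A"}, {"ParentZ", "A"},
            {"ParentY", "B"}, {"ParentZ", "B"},
            {"ParentX", "C"}, {"ParentY", "C"}, {"ParentZ", "C"}
        }
    ),
    // Step 1: count the parents of each child
    Counts = Table.Group(
        Source,
        {"Attribute:id"},
        {{"Count", each Table.RowCount(_), type number}}
    ),
    // Step 2: build a descending index list per child and expand it to rows
    WithLists = Table.AddColumn(
        Counts, "ParentIndex", each List.Numbers([Count], [Count], -1)
    ),
    Expanded = Table.ExpandListColumn(WithLists, "ParentIndex"),
    // Step 3: add a positional key to both tables and join on it
    IndexKeys = Table.AddIndexColumn(
        Table.SelectColumns(Expanded, {"ParentIndex"}), "Index", 0, 1
    ),
    SourceKeys = Table.AddIndexColumn(Source, "Index", 0, 1),
    Merged = Table.NestedJoin(
        SourceKeys, {"Index"}, IndexKeys, {"Index"}, "NewColumn", JoinKind.LeftOuter
    ),
    WithIndex = Table.ExpandTableColumn(Merged, "NewColumn", {"ParentIndex"}),
    // Restore the source order in case the join changed it, then drop the key
    Sorted = Table.Sort(WithIndex, {{"Index", Order.Ascending}}),
    Result = Table.RemoveColumns(Sorted, {"Index"})
in
    Result
```

Joining on the `Index` key gives exactly one match per row. Joining on `Attribute:id` instead would match every parent row of a child with every index row of that child, which is where the duplicate data comes from.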