Power Query Formula Language - Get children based on parent adjacent column value

Question

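Please bear with me, this is my first attempt at using the Power Query Formula Language. I need some advice on how to solve a particular problem with sorting and filtering source data.

I currently have this source data, structured as follows:

Using this Power Query: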

let
    Source = Excel.CurrentWorkbook(){[Name="EmployeeOrganization"]}[Content],
    ListEmployees = Table.Group(Source, {"Organization"}, {{"Employee", each Text.Combine([Employee],","), type text}}),
    CountEmployees = Table.AddColumn(ListEmployees, "Count", each List.Count(Text.Split([Employee],","))),
    SplitEmployees = Table.SplitColumn(ListEmployees, "Employee", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),List.Max(CountEmployees[Count])),
    Transpose = Table.Transpose(SplitEmployees),
    PromoteHeaders = Table.PromoteHeaders(Transpose, [PromoteAllScalars=true])
in
    PromoteHeaders

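I am able to produce the following result:

To avoid having to add the organization name to every employee in the source, I would like the organization name to act as a parent group with the employees as its children. I would also like the result to include only the organizations (and their employees) whose status is Active = Yes.

The desired source should look similar to this:

So the desired result should look similar to this (Apple is gone due to Active = NO):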

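I am stuck at this point and need some advice on how to modify my Power Query formula to:

Get only the active organizations (regardless of whether they have employees)
Somehow link the child employees to the proper organization (without having to write the organization name in each adjacent employee column)

(The Excel file can be found here)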

Answer 1

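In PQ, you need to fill down the blank rows and then do a pivot with no aggregation.

See the comments in the code, and follow the Applied Steps to understand the algorithm.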
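As a minimal sketch of the fill-down step on its own, the query below builds a small inline table and fills the blank organization rows before filtering. The sample rows and the step names (`Sample`, `FilledDown`, `ActiveOnly`) are made up for illustration and are not the data from the question; only `Table.FromRows`, `Table.FillDown` and `Table.SelectRows` are standard library functions.

```powerquery
let
    // Hypothetical sample rows that mimic the parent/child layout:
    // the organization and its Active flag appear only on the first row of each group
    Sample = Table.FromRows(
        {
            {"Microsoft", "Mike", "Yes"},
            {null, "Peter", null},
            {"Apple", "Ann", "No"},
            {null, "Joe", null}
        },
        {"Organization", "Employee", "Active"}
    ),
    // Copy each organization name and Active flag down into the null cells beneath it
    FilledDown = Table.FillDown(Sample, {"Organization", "Active"}),
    // Keep only the rows that now belong to an active organization
    ActiveOnly = Table.SelectRows(FilledDown, each [Active] = "Yes")
in
    ActiveOnly
```

With these sample rows, only the two Microsoft rows (Mike and Peter) should remain. Note that `Table.FillDown` only fills null cells, which is why the full query further down first replaces empty strings with null.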

Source

Custom Function
Rename to: fnPivotAll

//credit: Cam Wallace  https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/

(Source as table,
    ColToPivot as text,
    ColForValues as text)=> 

let
     PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
     #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
 
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
    
    let
        PartialRecord = Record.SelectFields(rec,fieldnames),
        RecordToList = Record.ToList(PartialRecord),
        Table = Table.FromColumns(RecordToList,fieldnames)
    in
        Table,
 
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"

Basic Query

let

//Read in data and set data types
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k12yc9LzEkpVtJRAqLI1GKlWJ1oEDMgtSS1CCQK5XvlpyLzEvPgXMeCgpxUiH6/fJgC38SiSiT1jjmZyXAN7vn56TAdyDYmluYgaXHKTwLzYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Organization = _t, Employee = _t, Active = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Organization", type text}, {"Employee", type text}, {"Active", type text}}),

//replace blanks with null if not already there
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Organization", "Employee", "Active"}),

//fill down the Organization and Active columns
    #"Filled Down" = Table.FillDown(#"Replaced Value",{"Organization", "Active"}),

//Filter to show only Active="Yes" and Employee not null
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Employee] <> null) and ([Active] = "Yes")),

//Pivot with no aggregation
//could do this with grouping, but easier (and maybe faster) with a custom function
    pivotAll = fnPivotAll(#"Filtered Rows","Organization","Employee"),

//remove unneeded Active column and set data types
    #"Removed Columns" = Table.RemoveColumns(pivotAll,{"Active"}),
    typed = Table.TransformColumnTypes(#"Removed Columns",
        List.Transform(Table.ColumnNames(#"Removed Columns"),each {_, Text.Type}))

in
    typed
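The comment in the query mentions that the pivot could also be done with grouping. One possible way to do that, shown here only as a sketch and not as the answer's method, is to collect each organization's employees into a list and then build a table from those lists. The sample rows and the step names (`Filtered`, `Grouped`, `Result`) are hypothetical; in the real query, `Filtered` would be the output of the #"Filtered Rows" step.

```powerquery
let
    // Hypothetical rows, standing in for data that is already filled down and filtered
    Filtered = Table.FromRows(
        {
            {"Microsoft", "Mike"},
            {"Microsoft", "Peter"},
            {"Google", "Sara"}
        },
        {"Organization", "Employee"}
    ),
    // One row per organization, with that organization's employees collected into a list
    Grouped = Table.Group(
        Filtered,
        {"Organization"},
        {{"Employees", each [Employee], type list}}
    ),
    // Turn each employee list into a column named after its organization;
    // shorter lists are padded with null
    Result = Table.FromColumns(Grouped[Employees], Grouped[Organization])
in
    Result
```

For these sample rows the result should have a Microsoft column containing Mike and Peter, and a Google column containing Sara followed by null.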

typed results


excel

powerquery