Power Query 公式语言 - 根据父相邻列值获取子项
Power Query Formula Language - Get children based on parent adjacent column value
请耐心等待,这是我第一次尝试使用 Power Query 公式语言。我需要一些关于如何解决排序和过滤源数据的特定问题的建议。
我现在得到了这个当前源数据,结构如下:
使用这个强大的查询:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeOrganization"]}[Content],
ListEmployees = Table.Group(Source, {"Organization"}, {{"Employee", each Text.Combine([Employee],","), type text}}),
CountEmployees = Table.AddColumn(ListEmployees, "Count", each List.Count(Text.Split([Employee],","))),
SplitEmployees = Table.SplitColumn(ListEmployees, "Employee", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),List.Max(CountEmployees[Count])),
Transpose = Table.Transpose(SplitEmployees),
PromoteHeaders = Table.PromoteHeaders(Transpose, [PromoteAllScalars=true])
in
PromoteHeaders
我能够产生以下结果:
为了避免必须将组织名称添加到源中的每个员工,我希望组织名称充当父组,员工作为子组。我还希望结果只获取状态为 Active = Yes 的组织(+ 员工)。
所需的来源应与此类似:
所以想要的结果看起来应该类似于这样:(Apple is gone due to Active = NO)
我被困在这一点上,需要一些关于如何修改我的 Power Query 公式的建议:
- 只获取活跃的组织(不管他们是否有雇员)
- 不知何故 link 子员工到正确的组织。 (无需在每个相邻的员工列中写下组织名称)
(Excel文件可以找到her)
在 PQ 中,您需要填写空白行,然后进行无聚合的透视。
看代码中的注释,跟着Applied Steps
理解算法
来源
自定义函数
重命名: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"
基本查询
let
//Read in data and set data types
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k12yc9LzEkpVtJRAqLI1GKlWJ1oEDMgtSS1CCQK5XvlpyLzEvPgXMeCgpxUiH6/fJgC38SiSiT1jjmZyXAN7vn56TAdyDYmluYgaXHKTwLzYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Organization = _t, Employee = _t, Active = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Organization", type text}, {"Employee", type text}, {"Active", type text}}),
//replace blanks with null if not already there
#"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Organization", "Employee", "Active"}),
//fill down the Company and active columns
#"Filled Down" = Table.FillDown(#"Replaced Value",{"Organization", "Active"}),
//Filter to show only Active="Yes and Employee not null
#"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Employee] <> null) and ([Active] = "Yes")),
//Pivot with no aggregation
//could do this with grouping, but easier (and maybe faster, with a custom function
pivotAll = fnPivotAll(#"Filtered Rows","Organization","Employee"),
//remove unneeded Active column and set data types
#"Removed Columns" = Table.RemoveColumns(pivotAll,{"Active"}),
typed = Table.TransformColumnTypes(#"Removed Columns",
List.Transform(Table.ColumnNames(#"Removed Columns"),each {_, Text.Type}))
in
typed
键入 结果
请耐心等待,这是我第一次尝试使用 Power Query 公式语言。我需要一些关于如何解决排序和过滤源数据的特定问题的建议。
我现在得到了这个当前源数据,结构如下:
使用这个强大的查询:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeOrganization"]}[Content],
ListEmployees = Table.Group(Source, {"Organization"}, {{"Employee", each Text.Combine([Employee],","), type text}}),
CountEmployees = Table.AddColumn(ListEmployees, "Count", each List.Count(Text.Split([Employee],","))),
SplitEmployees = Table.SplitColumn(ListEmployees, "Employee", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),List.Max(CountEmployees[Count])),
Transpose = Table.Transpose(SplitEmployees),
PromoteHeaders = Table.PromoteHeaders(Transpose, [PromoteAllScalars=true])
in
PromoteHeaders
我能够产生以下结果:
为了避免必须将组织名称添加到源中的每个员工,我希望组织名称充当父组,员工作为子组。我还希望结果只获取状态为 Active = Yes 的组织(+ 员工)。
所需的来源应与此类似:
所以想要的结果看起来应该类似于这样:(Apple is gone due to Active = NO)
我被困在这一点上,需要一些关于如何修改我的 Power Query 公式的建议:
- 只获取活跃的组织(不管他们是否有雇员)
- 不知何故 link 子员工到正确的组织。 (无需在每个相邻的员工列中写下组织名称)
(Excel文件可以找到her)
在 PQ 中,您需要填写空白行,然后进行无聚合的透视。
看代码中的注释,跟着Applied Steps
理解算法
来源
自定义函数
重命名: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"
基本查询
let
//Read in data and set data types
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k12yc9LzEkpVtJRAqLI1GKlWJ1oEDMgtSS1CCQK5XvlpyLzEvPgXMeCgpxUiH6/fJgC38SiSiT1jjmZyXAN7vn56TAdyDYmluYgaXHKTwLzYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Organization = _t, Employee = _t, Active = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Organization", type text}, {"Employee", type text}, {"Active", type text}}),
//replace blanks with null if not already there
#"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Organization", "Employee", "Active"}),
//fill down the Company and active columns
#"Filled Down" = Table.FillDown(#"Replaced Value",{"Organization", "Active"}),
//Filter to show only Active="Yes and Employee not null
#"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Employee] <> null) and ([Active] = "Yes")),
//Pivot with no aggregation
//could do this with grouping, but easier (and maybe faster, with a custom function
pivotAll = fnPivotAll(#"Filtered Rows","Organization","Employee"),
//remove unneeded Active column and set data types
#"Removed Columns" = Table.RemoveColumns(pivotAll,{"Active"}),
typed = Table.TransformColumnTypes(#"Removed Columns",
List.Transform(Table.ColumnNames(#"Removed Columns"),each {_, Text.Type}))
in
typed
键入 结果