Avoid Excel PowerQuery error when second data page headings not available
I use PowerQuery to import data from a CSV file that looks like this:
Report Title,,,,,
,Date,Type,USER_ID,PICKED_QTY,No of Hours
,31/10/2021,Type A,User_1,300,3
,31/10/2021,Type A,User_3,250,8
,01/11/2021,Type B,User_1,167,5
,01/11/2021,Type C,User_2,988,2
,02/11/2021,Type A,User_1,1113,4
Date,Type,USER_ID,PICKED_QTY,No of Hours,
03/11/2021,Type C,User_1,1500,5,
04/11/2021,Type A,User_1,200,8,
Sometimes it looks like this instead (no second page), and this is where the problem arises:
Report Title,,,,,
,Date,Type,USER_ID,PICKED_QTY,No of Hours
,31/10/2021,Type A,User_1,300,3
,31/10/2021,Type A,User_3,250,8
,01/11/2021,Type B,User_1,167,5
,01/11/2021,Type C,User_2,988,2
,02/11/2021,Type A,User_1,1113,4
I use this PQ to convert the data into a readable format (the real source will differ, but for simplicity it references a table here):
DataSource:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
SplitData = Table.SplitColumn(Source,"Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv))
in
SplitData
I then use two queries to line up the data, so that the dates all end up in the same column, and so on.
Query 1:
let
Source = DataSource,
RemoveTopRows = Table.Skip(Source,1),
PromoteHeaders = Table.PromoteHeaders(RemoveTopRows, [PromoteAllScalars=true]),
FilterRows = Table.SelectRows(PromoteHeaders, each ([#""] = "")),
RemoveOtherColumns = Table.SelectColumns(FilterRows,{"Date", "Type", "USER_ID", "PICKED_QTY", "No of Hours"}),
ChangeType = Table.TransformColumnTypes(RemoveOtherColumns,{{"Date", type date}, {"Type", type text},
{"USER_ID", type text}, {"PICKED_QTY", Int64.Type},
{"No of Hours", Int64.Type}})
in
ChangeType
Query 2:
let
Source = DataSource,
RemoveTopRows = Table.Skip(Source,1),
FilterRows = Table.SelectRows(RemoveTopRows, each ([Column1.1] <> "")),
PromoteHeaders = Table.PromoteHeaders(FilterRows, [PromoteAllScalars=true]),
RemoveOtherColumns = Table.SelectColumns(PromoteHeaders,{"Date", "Type", "USER_ID", "PICKED_QTY", "No of Hours"}),
FilterRows2 = Table.SelectRows(RemoveOtherColumns, each ([Date] <> "Date")),
ChangeType = Table.TransformColumnTypes(FilterRows2,{{"Date", type date}, {"Type", type text}, {"USER_ID", type text},
{"PICKED_QTY", Int64.Type}, {"No of Hours", Int64.Type}})
in
ChangeType
Finally, I append the first two queries and group the result to get the final table.
Query 3:
let
Source = Query1,
AppendQueries = Table.Combine({Source, Query2}),
SortRows = Table.Sort(AppendQueries,{{"Date", Order.Ascending}}),
GroupRows = Table.Group(SortRows, {"Date", "Type"}, {{"Picked Qty", each List.Sum([PICKED_QTY]), type nullable number},
{"Total Hours", each List.Sum([No of Hours]), type nullable number}}),
AddDivision = Table.AddColumn(GroupRows, "Rate", each [Picked Qty] / [Total Hours], type number)
in
AddDivision
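Problem
Sometimes my raw data does not include the second page of data, so Query2 is not needed.
When this happens, unless I manually add headers for the second page, I get the error: [Expression Error] The column 'Date' of the table wasn't found.
How can I avoid this? The error occurs in Query2 at the RemoveOtherColumns step: without the column headers it cannot find the right columns. Query3 then returns an error as well, because it cannot append a query that failed.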
No need to rewrite everything; just change the last line of Query2 to
in try ChangeType otherwise Table.FromRecords({[Date = null, Type = null, USER_ID=null, PICKED_QTY=null, No of Hours = null]})
or
in try ChangeType otherwise Table.Skip(Table.FromRecords({[Date = null, Type = null, USER_ID=null, PICKED_QTY=null, No of Hours = null]}),1)
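A possible refinement of the same `try ... otherwise` idea, offered as a sketch rather than a tested fix: the first fallback above returns one all-null row, and the second returns zero rows with untyped columns. A zero-row table built with `#table` and an explicit table type should keep the fallback's column types in line with what ChangeType produces. The step name EmptyPage is hypothetical; everything else is Query2 as posted in the question.

```powerquery
let
    Source = DataSource,
    RemoveTopRows = Table.Skip(Source, 1),
    // Keep only second-page rows (first split column is not blank)
    FilterRows = Table.SelectRows(RemoveTopRows, each [Column1.1] <> ""),
    PromoteHeaders = Table.PromoteHeaders(FilterRows, [PromoteAllScalars = true]),
    // This step raises the "column 'Date' ... wasn't found" error when there is no second page
    RemoveOtherColumns = Table.SelectColumns(PromoteHeaders, {"Date", "Type", "USER_ID", "PICKED_QTY", "No of Hours"}),
    FilterRows2 = Table.SelectRows(RemoveOtherColumns, each [Date] <> "Date"),
    ChangeType = Table.TransformColumnTypes(FilterRows2, {{"Date", type date}, {"Type", type text}, {"USER_ID", type text},
        {"PICKED_QTY", Int64.Type}, {"No of Hours", Int64.Type}}),
    // Hypothetical fallback: zero rows, same column names and types as ChangeType
    EmptyPage = #table(
        type table [Date = date, Type = text, USER_ID = text, PICKED_QTY = Int64.Type, #"No of Hours" = Int64.Type],
        {}
    )
in
    try ChangeType otherwise EmptyPage
```

Because EmptyPage has no rows, appending it in Query3 should leave the first page's totals unchanged.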
Or just do everything in a single query:
let Source = Csv.Document(File.Contents("C:\temp2\data.csv"),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Removed Top Rows" = Table.Skip(Source,1),
#"Filtered Rows" = Table.PromoteHeaders(Table.SelectRows(#"Removed Top Rows", each [Column1] = ""), [PromoteAllScalars=true]),
#"Filtered Rows2" = Table.PromoteHeaders(Table.SelectRows(#"Removed Top Rows", each [Column1] <> ""), [PromoteAllScalars=true]),
AppendQueries = Table.Combine({#"Filtered Rows",#"Filtered Rows2"}),
SortRows = Table.Sort(AppendQueries,{{"Date", Order.Ascending}}),
#"Changed Type1" = Table.TransformColumnTypes(SortRows,{{"PICKED_QTY", type number}, {"No of Hours", type number}}),
GroupRows = Table.Group(#"Changed Type1", {"Date", "Type"}, {{"Picked Qty", each List.Sum([PICKED_QTY]), type number}, {"Total Hours", each List.Sum([No of Hours]), type number}}),
AddDivision = Table.AddColumn(GroupRows, "Rate", each [Picked Qty] / [Total Hours], type number)
in AddDivision
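One thing to watch in the single query: it only changes the types of PICKED_QTY and No of Hours, so Date stays as text and Table.Sort orders it as text. If chronological order matters, the column could be converted before sorting. The sketch below uses a small inline sample rather than the CSV file, and the "en-GB" culture is an assumption based on the dd/MM/yyyy dates in the sample data; adjust it to match the real file.

```powerquery
let
    // Inline sample standing in for the appended pages (values taken from the question's data)
    Sample = #table(
        {"Date", "Type"},
        {{"31/10/2021", "Type A"}, {"01/11/2021", "Type B"}}
    ),
    // Assumed culture: en-GB, so "31/10/2021" is read as day/month/year
    Typed = Table.TransformColumnTypes(Sample, {{"Date", type date}}, "en-GB"),
    // With a real date type, 31 October sorts before 1 November
    Sorted = Table.Sort(Typed, {{"Date", Order.Ascending}})
in
    Sorted
```

In the single query, this would amount to adding {"Date", type date} to the type-change step and moving that step ahead of SortRows.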