How can I interpolate missing values in a column in Power Query / Power BI?
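There is a post that describes how to do this for one very specific example:
https://community.powerbi.com/t5/Community-Blog/Linear-Interpolation-with-Power-BI/ba-p/341202
But that code is not very portable, because it refers to specific columns by name, among other things.
It also doesn't package the code as a function, so your Power Query ends up cluttered with a lot of extra steps and variables.
I wrote a (relatively) generic function for interpolating values in Power Query (also useful for Power BI and M code).
It takes a table and two column names as input. It outputs the table with the missing y values interpolated from the nearest x, y pairs, where "nearest" is determined by the order of the data passed in (not by numerical proximity). If you need numerical proximity, just sort by x (and possibly buffer) before passing the table to this function.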
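For reference, the straight-line interpolation that the function's comments describe can be written out as follows. This is only a restatement of those comments: the subscript i is notation introduced here, with (x_i, y_i) standing for the nearest known pair before a missing row in table order and (x_{i+1}, y_{i+1}) for the next known pair after it.

```latex
m_i = \frac{y_{i+1} - y_i}{x_{i+1} - x_i}, \qquad
b_i = y_i - m_i \, x_i, \qquad
\hat{y} = m_i \, x + b_i
```

Here \hat{y} is the value written into a row whose y is missing and whose x value is x.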
(Input as table, xColumn as text, yColumn as text) =>
//Interpolates missing yColumn values based on nearest existing xColumn, yColumn pairs
let
    Buffer = Table.Buffer(Input),
    //index for joining calculations and preserving original order
    #"Added Main Index" = Table.AddIndexColumn(Buffer, "InterpolateMainIndex", 0, 1),
    //keep only the x column, the y column and the index
    #"Two Columns and Index" = Table.RemoveColumns(#"Added Main Index", List.Select(Table.ColumnNames(#"Added Main Index"), each _ <> xColumn and _ <> yColumn and _ <> "InterpolateMainIndex")),
    #"Remove Blanks" = Table.SelectRows(#"Two Columns and Index", each Record.Field(_, yColumn) <> null and Record.Field(_, yColumn) <> ""),
    //index for referring to next non-blank record
    #"Added Sub Index" = Table.AddIndexColumn(#"Remove Blanks", "InterpolateSubIndex", 0, 1),
    //m = (y2 - y1) / (x2 - x1)
    //note: the last non-blank row has no next record, so its m (and therefore b) cell holds an error
    m = Table.AddColumn(#"Added Sub Index",
        "m",
        each (Number.From(Record.Field(_, yColumn)) - Number.From(Record.Field(#"Added Sub Index"{[InterpolateSubIndex]+1}, yColumn))) /
             (Number.From(Record.Field(_, xColumn)) - Number.From(Record.Field(#"Added Sub Index"{[InterpolateSubIndex]+1}, xColumn))),
        type number),
    //b = y - m * x
    b = Table.AddColumn(m, "b", each Number.From(Record.Field(_, yColumn)) - [m] * Number.From(Record.Field(_, xColumn)), type number),
    //rename or remove columns to allow full join
    #"Renamed Columns" = Table.RenameColumns(b, {{"InterpolateMainIndex", "InterpolateMainIndexCopy"}}),
    xColumnmb = Table.RemoveColumns(#"Renamed Columns", {yColumn, xColumn, "InterpolateSubIndex"}),
    Join = Table.Join(#"Added Main Index", "InterpolateMainIndex", xColumnmb, "InterpolateMainIndexCopy", JoinKind.FullOuter),
    //enforce original sorting
    #"Sorted by Main Index" = Table.Sort(Join, {{"InterpolateMainIndex", Order.Ascending}}),
    //carry m and b from each known row down to the blank rows that follow it
    #"Filled Down mb" = Table.FillDown(#"Sorted by Main Index", {"m", "b"}),
    //y = m * x + b (only null y values are replaced)
    Interpolate = Table.ReplaceValue(#"Filled Down mb", null, each ([m] * Number.From(Record.Field(_, xColumn)) + [b]), Replacer.ReplaceValue, {yColumn}),
    //clean up
    #"Remove Temporary Columns" = Table.RemoveColumns(Interpolate, {"m", "b", "InterpolateMainIndex", "InterpolateMainIndexCopy"}),
    #"Restore Types" = Value.ReplaceType(#"Remove Temporary Columns", Value.Type(Input))
in
    #"Restore Types"
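A minimal sketch of how the function might be called, assuming it has been saved as a query named fnInterpolate (a hypothetical name, as are the sample table and its column names x and y). It sorts by x and buffers first, as suggested above, so that the nearest pairs in table order are also the numerically nearest ones.

```powerquery
let
    // Hypothetical sample data; the null entries in "y" are the gaps to fill
    Source = #table(
        type table [x = number, y = nullable number],
        {{1, 10}, {2, null}, {3, 30}, {4, null}, {5, 50}}
    ),
    // Sort by x so that table order matches numerical order, then buffer the result
    Sorted = Table.Buffer(Table.Sort(Source, {{"x", Order.Ascending}})),
    // fnInterpolate is an assumed name for the query that holds the function above
    Result = fnInterpolate(Sorted, "x", "y")
in
    Result
```

In this sample the gap at x = 2 sits between the known points (1, 10) and (3, 30), so a straight line between them gives y = 20 there.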