Recreate Moving Median and Moving Mode Excel formula in SQL
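I am trying to recreate the following Excel formula/table in SQL so that it shows True/False, but I am stuck.
Excel formula: =ABS(ROUND(MEDIAN(C$2:C2),0)-ROUND(MODE.SNGL(C$2:C2),0)) <[sample.xlsx]Variable!$B$2
I have 200+ rows of data, and for each row I need the median and mode calculated only from the first row through the current row. I can compute the median over all rows in SQL, but that is not what I need. The same goes for the mode. The formula above sits in Excel cell D2 and is filled down. The variable it references at the end of the formula is just the number 4.
Any suggestions or pointers would be great. Thanks!
Excel snippet: [image: quick view of how this table looks in Excel]
SQL code to build this exact table in SQL: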
IF NOT EXISTS (
select * from sysobjects where name='SampleExample' and xtype='U'
) CREATE TABLE SampleExample (
[Seconds] INT,
[Sequence] INT,
[Value] NUMERIC(12, 9),
[Result] NVARCHAR(4)
);
INSERT INTO SampleExample VALUES
(598,1,236.888453364,N'#N/A'),
(740,2,236.888453364,N'True'),
(885,3,235.463708639,N'True'),
(1024,4,236.177295446,N'True'),
(1189,5,236.177295446,N'True'),
(1330,6,236.866638064,N'True'),
(1463,7,236.177295446,N'True'),
(1599,8,236.866638064,N'True'),
(1735,9,236.866638064,N'True'),
(1863,10,236.866638064,N'True'),
(1986,11,236.866638064,N'True'),
(2110,12,236.866638064,N'True'),
(2235,13,236.880749464,N'True'),
(2362,14,236.908763647,N'True'),
(2487,15,236.908763647,N'True'),
(2610,16,236.908763647,N'True'),
(2739,17,237.190827727,N'True'),
(2865,18,237.190827727,N'True'),
(3008,19,237.190827727,N'True'),
(3132,20,237.190827727,N'True');
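Current median query. I added a column named Filename to my SQL table; it holds the same value for every row. But this finds the median of all rows in the table, not of row 1 through the current row.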
Declare @Median AS INT
Select @Median = (
(Select MAX([Value])
FROM
(Select TOP 50 PERCENT [Value], [Filename]
FROM SampleExample
Order by [Filename]) as BOTTOMHALF)
+
(Select MIN([Value])
FROM
(Select TOP 50 PERCENT [Value], [Filename]
FROM SampleExample
Order by [Filename] desc) as TOPHALF) ) / 2
Current mode query:
Declare @Mode as INT
Select @Mode = (
Select TOP 1 ROUND([Value],0) as MODE
from SampleExample
Group by [Value]
Order by COUNT(*) DESC
)
The result I am looking for is True/False. I use CASE in my SQL query:
CASE WHEN @Variable > @Median - @Mode THEN 'True' ELSE 'False' END AS Result
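SQL Server (and SQL in general) has a function for calculating the median. It is called percentile_cont(), and in SQL Server it exists only as a window function, not as an aggregate function.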
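As a rough illustration of that window-function form, a whole-table median for the question's SampleExample table could look like the sketch below. The empty OVER () and the DISTINCT (used only to collapse the repeated per-row result to a single row) are choices made for this example, not something taken from the question.

```sql
-- Sketch: median of [Value] across the whole SampleExample table.
-- In SQL Server, PERCENTILE_CONT must be written with OVER (...), so the same
-- median is repeated on every row; DISTINCT collapses it to a single row.
SELECT DISTINCT
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY [Value]) OVER () AS median
FROM SampleExample;
```

This still gives the median of all rows, which is the limitation described in the question; the per-row version needs more work.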
You want a running median. Ideally, it would be nice to write something like this:
select se.*,
avg(value) over (order by sequence) as avg_value,
percentile_cont(0.5) within group (order by value) over (order by sequence) as median
from sampleexample se;
But cumulative medians are not supported. So, that leaves the apply option:
select se.*, se2.*
from sampleexample se cross apply
(select top (1) percentile_cont(0.5) within group (order by value) over () as median,
avg(value) over () as avg_value
from sampleexample se2
where se2.sequence <= se.sequence
) se2;
Here is a db<>fiddle.
EDIT:
I really read the question as median and average, not median and mode (wishful thinking on my part). For the mode, you do need a subquery, so:
select se.*, se2.*
from sampleexample se cross apply
(select top (1) percentile_cont(0.5) within group (order by value) over () as median,
avg(value) over () as avg_value,
value as mode
from (select se2.*, count(*) over (partition by se2.value) as value_cnt
from sampleexample se2
where se2.sequence <= se.sequence
) se2
order by se2.value_cnt desc
) se2
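To get from there to the True/False column, one possible sketch follows. It rests on several assumptions that are not part of the query above: the threshold 4 is held in a variable named @Threshold here, ties for the mode are broken by the earliest Sequence (which is my understanding of how MODE.SNGL behaves, so treat it as an assumption), and a row with no repeated value yet returns '#N/A', as the first row does in Excel. The aliases w and m and the column first_seq are illustrative names only.

```sql
-- Sketch: running median, running mode, and the Excel-style flag for each row.
-- Needs SQL Server 2012 or later for PERCENTILE_CONT.
DECLARE @Threshold INT = 4;  -- assumption: the number 4 that the Excel formula references

SELECT se.[Sequence],
       se.[Value],
       m.median,
       m.mode,
       CASE
           WHEN m.mode IS NULL THEN N'#N/A'  -- no value has repeated yet
           WHEN ABS(ROUND(m.median, 0) - ROUND(m.mode, 0)) < @Threshold THEN N'True'
           ELSE N'False'
       END AS Result
FROM SampleExample AS se
CROSS APPLY (
    -- All rows from the first one through the current one
    SELECT TOP (1)
           PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY w.[Value]) OVER () AS median,
           CASE WHEN w.value_cnt > 1 THEN w.[Value] END AS mode
    FROM (
        SELECT se2.[Value],
               COUNT(*) OVER (PARTITION BY se2.[Value]) AS value_cnt,
               MIN(se2.[Sequence]) OVER (PARTITION BY se2.[Value]) AS first_seq
        FROM SampleExample AS se2
        WHERE se2.[Sequence] <= se.[Sequence]
    ) AS w
    -- Most frequent value first; assumed tie-break: the value seen earliest
    ORDER BY w.value_cnt DESC, w.first_seq
) AS m
ORDER BY se.[Sequence];
```

For the sample data this should give '#N/A' for Sequence 1, where nothing has repeated yet, and 'True' for every later row: all values lie between 235 and 238, so the rounded median and rounded mode never differ by as much as 4. Note that the subquery rescans the earlier rows for every row, which is fine for a few hundred rows but would get slow on large tables.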