Recreate Moving Median and Moving Mode Excel formula in SQL

I am trying to recreate the following Excel formula/table and display True/False, but I'm stuck.

Excel formula: =ABS(ROUND(MEDIAN(C$2:C2),0)-ROUND(MODE.SNGL(C$2:C2),0)) <[sample.xlsx]Variable!$B$2

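I have 200+ rows of data, and I only need the median and mode computed from the first row through the current row. I can compute the median across all rows in SQL, but that doesn't fit my need. The same goes for the mode. The formula above goes in Excel cell D2 and is filled down. The variable it references at the end of the formula is just the number 4.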

Any suggestions or pointers would be great. Thanks!

Excel snippet: Quick view of how this table looks in Excel

SQL code to build this exact table:

IF NOT EXISTS (
    select * from sysobjects where name='SampleExample' and xtype='U'
) CREATE TABLE SampleExample (
    [Seconds] INT,
    [Sequence] INT,
    [Value] NUMERIC(12, 9),
    [Result] NVARCHAR(4)
);
INSERT INTO SampleExample VALUES
    (598,1,236.888453364,N'#N/A'),
    (740,2,236.888453364,N'True'),
    (885,3,235.463708639,N'True'),
    (1024,4,236.177295446,N'True'),
    (1189,5,236.177295446,N'True'),
    (1330,6,236.866638064,N'True'),
    (1463,7,236.177295446,N'True'),
    (1599,8,236.866638064,N'True'),
    (1735,9,236.866638064,N'True'),
    (1863,10,236.866638064,N'True'),
    (1986,11,236.866638064,N'True'),
    (2110,12,236.866638064,N'True'),
    (2235,13,236.880749464,N'True'),
    (2362,14,236.908763647,N'True'),
    (2487,15,236.908763647,N'True'),
    (2610,16,236.908763647,N'True'),
    (2739,17,237.190827727,N'True'),
    (2865,18,237.190827727,N'True'),
    (3008,19,237.190827727,N'True'),
    (3132,20,237.190827727,N'True');

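Current median query. I added a column named Filename to my SQL table, which holds the same value for every row. But this finds the median of all rows in the table, not of row 1 through the current row.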

Declare @Median AS INT
Select @Median = ( 
    (Select MAX([Value])
    FROM 
        (Select TOP 50 PERCENT [Value], [Filename] 
        FROM SampleExample 
        
        Order by [Filename]) as BOTTOMHALF)
    + 
    (Select MIN([Value])
    FROM
        (Select TOP 50 PERCENT [Value], [Filename]
        FROM SampleExample 
        
        Order by [Filename] desc) as TOPHALF) ) / 2 

Current mode query:

Declare @Mode as INT
Select @Mode = (
    Select TOP 1 ROUND([Value],0) as MODE
    from SampleExample 
    Group by [Value]
    Order by COUNT(*) DESC
    )

The result I'm looking for is True/False. I use CASE in my SQL query:

CASE WHEN @Variable > @Median - @Mode THEN 'True' ELSE 'False' END AS Result

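SQL Server (and SQL in general) has a function for calculating the median. It is named percentile_cont(). In SQL Server it exists only as a window function, not as an aggregate function.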

You want a running median. Ideally, it would be nice to write this:

select se.*,
       avg(value) over (order by sequence) as avg_value,
       percentile_cont(0.5) within group (order by value) over (order by sequence)
from sampleexample se;

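But cumulative medians are not supported: percentile_cont() does not accept an ORDER BY in its OVER clause. So that leaves the apply option: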

select se.*, se2.*
from sampleexample se cross apply
     (select top (1)  percentile_cont(0.5) within group (order by value) over () as median,
             avg(value) over () as avg_value
      from sampleexample se2
      where se2.sequence <= se.sequence
     ) se2;

Here is a db<>fiddle.

EDIT:

I actually read the question as median and mean, not median and mode (wishful thinking on my part). For the mode, you do need a subquery, so:

select se.*, se2.*
from sampleexample se cross apply
     (select top (1) percentile_cont(0.5) within group (order by value) over () as median,
             avg(value) over () as avg_value,
             value as mode
      from (select se2.*, count(*) over (partition by se2.value) as value_cnt
            from sampleexample se2
            where se2.sequence <= se.sequence
           ) se2
      order by se2.value_cnt desc
     ) se2
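
To get from the running median and mode to the True/False column the question asks for, the pieces above could be combined roughly as follows. This is only a sketch under a few assumptions that are not in the original posts: `@Threshold` is a hypothetical name standing in for the Excel variable cell (the value 4); ties for the mode are broken by the value that appears first in `Sequence` order, which is my understanding of how MODE.SNGL behaves; and a window with no repeated value yields '#N/A', mimicking the first row of the Excel sheet.

```sql
-- Assumption: @Threshold is a hypothetical stand-in for the Excel variable cell (4).
DECLARE @Threshold INT = 4;

SELECT se.[Seconds], se.[Sequence], se.[Value],
       CASE
           -- Assumption: no repeated value in the window => Excel shows #N/A.
           WHEN m.mode IS NULL THEN '#N/A'
           WHEN ABS(ROUND(m.median, 0) - ROUND(m.mode, 0)) < @Threshold THEN 'True'
           ELSE 'False'
       END AS Result
FROM SampleExample se
CROSS APPLY (
    SELECT TOP (1)
           -- Median over every row from the first row up to the current row.
           PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY c.[Value]) OVER () AS median,
           -- Most frequent value, or NULL when nothing repeats yet.
           CASE WHEN c.value_cnt > 1 THEN c.[Value] END AS mode
    FROM (
        SELECT se2.[Value],
               COUNT(*) OVER (PARTITION BY se2.[Value]) AS value_cnt,
               MIN(se2.[Sequence]) OVER (PARTITION BY se2.[Value]) AS first_seq
        FROM SampleExample se2
        WHERE se2.[Sequence] <= se.[Sequence]
    ) c
    -- Highest count first; assumed tie-break: earliest first appearance.
    ORDER BY c.value_cnt DESC, c.first_seq
) m
ORDER BY se.[Sequence];
```

Note that this compares the absolute difference against the threshold, as the Excel formula does, whereas the CASE expression in the question compares the signed difference. The correlated subquery rescans all earlier rows for each row, which should be acceptable for a couple of hundred rows but scales quadratically.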