Window Functions: last_value(ORDER BY ... ASC) same as last_value(ORDER BY ... DESC)

Sample data

CREATE TABLE test
    (id integer, session_ID integer, value integer)
;

INSERT INTO test
    (id, session_ID, value)
VALUES
    (0, 2, 100),
    (1, 2, 120),
    (2, 2, 140),
    (3, 1, 900),
    (4, 1, 800),
    (5, 1, 500)
;

Current query

select 
id,
last_value(value) over (partition by session_ID order by id) as last_value_window,
last_value(value) over (partition by session_ID order by id desc) as last_value_window_desc
from test
ORDER BY id

I ran into a problem with the last_value() window function: http://sqlfiddle.com/#!15/bcec0/2

In the fiddle I am experimenting with the sort direction inside the last_value() call.

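Edit: The question is not why I don't get the all-time last value, or how to use the frame clause (unbounded preceding and unbounded following). I know the difference between first_value(desc) and last_value(), and I know that last_value() does not give you the all-time last value: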

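The default frame clause runs from unbounded preceding to the current row. So first_value() always returns the first row of the frame. It does not matter whether the frame holds just one row or all one hundred: the result is always the first one. With DESC order it is the same: DESC changes the sort order, so the first row is now the last value, no matter how many rows the frame contains.

last_value() behaves very similarly: at the first row, it gives you the last value of the default frame, which is that one row. At the second row, the frame contains two rows, and the last one is the second. That is why last_value() does not give you the last row of the whole partition, but only the last row up to the current row.

But if I change the order to DESC, I expect the last row of the partition to come first, so I should get that row at the first row, the second-to-last row at the second row, and so on. But that is not the result. Why?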

For the current example, these are the results of first_value(), first_value(desc), last_value(), and last_value(desc), together with what I expected from last_value(desc):

 id | fv_asc | fv_desc | lv_asc | lv_desc | lv_desc(expecting)
----+--------+---------+--------+---------+--------------------
  0 |    100 |     140 |    100 |     100 |    140
  1 |    100 |     140 |    120 |     120 |    120
  2 |    100 |     140 |    140 |     140 |    100
  3 |    900 |     500 |    900 |     900 |    500
  4 |    900 |     500 |    800 |     800 |    800
  5 |    900 |     500 |    500 |     500 |    900

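To me it looks as if the ORDER BY ... DESC flag is ignored in the last_value() call with the default frame clause, but not in the first_value() call. So my question is: why is the last_value() result the same as the last_value(desc) result?

The problem with LAST_VALUE() is that the default rules for the windowing clause remove the values that you really want. This is a very subtle problem, and it is true in all databases that support this functionality.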

This comes from an Oracle blog:

Whilst we are on the topic of windowing clauses, the implicit and unchangeable window clause for the FIRST and LAST functions is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, in other words all rows in our partition. For FIRST_VALUE and LAST_VALUE the default but changeable windowing clause is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, in other words we exclude rows after the current one. Dropping rows off the bottom of a list makes no difference when we are looking for the first row in the list (FIRST_VALUE) but it does make a difference when we are looking for the last row in the list (LAST_VALUE) so you will usually need either to specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING explicitly when using LAST_VALUE or just use FIRST_VALUE and reverse the sort order.

So, just use FIRST_VALUE(). This does what you want:

with test (id, session_ID, value) as (
      (VALUES (0, 2, 100),
              (1, 2, 120),
              (2, 2, 140),
              (3, 1, 900),
              (4, 1, 800),
              (5, 1, 500)
      )
     )
select id,
       first_value(value) over (partition by session_ID order by id) as first_value_window,
       first_value(value) over (partition by session_ID order by id desc) as first_value_window_desc
from test
order by id

Check how the window frame is defined. This example may help:

select 
    id,
    last_value(value) over (
        partition by session_id
        order by id
    ) as lv_asc,
    last_value(value) over (
        partition by session_id
        order by id desc
    ) as lv_desc,
    last_value(value) over (
        partition by session_id
        order by id
        rows between unbounded preceding and unbounded following
    ) as lv_asc_unbounded,
    last_value(value) over (
        partition by session_id
        order by id desc
        rows between unbounded preceding and unbounded following
    ) as lv_desc_unbounded
from test
order by id;
 id | lv_asc | lv_desc | lv_asc_unbounded | lv_desc_unbounded 
----+--------+---------+------------------+-------------------
  0 |    100 |     100 |              140 |               100
  1 |    120 |     120 |              140 |               100
  2 |    140 |     140 |              140 |               100
  3 |    900 |     900 |              500 |               900
  4 |    800 |     800 |              500 |               900
  5 |    500 |     500 |              500 |               900
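
The identical lv_asc and lv_desc columns follow from the default frame. As a minimal sketch (the column aliases are made up for illustration, and the inline VALUES list stands in for the test table), writing out the default frame, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, shows that the frame always ends at the current row:

```sql
-- Inline copy of the sample data, so the query runs on its own.
with test (id, session_id, value) as (
    values (0, 2, 100), (1, 2, 120), (2, 2, 140),
           (3, 1, 900), (4, 1, 800), (5, 1, 500)
)
select
    id,
    value,
    -- frame left implicit: the default frame applies
    last_value(value) over (
        partition by session_id
        order by id desc
    ) as lv_desc_default,
    -- the same frame written out explicitly
    last_value(value) over (
        partition by session_id
        order by id desc
        range between unbounded preceding and current row
    ) as lv_desc_explicit
from test
order by id;
```

Both columns should equal value on every row: id is unique within each partition, so the current row is always the last row of its own frame, whichever way the partition is sorted.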

One year later I found the solution:

Take this statement:

SELECT
    id,
    array_accum(value) over (partition BY session_ID ORDER BY id)      AS window_asc,
    first_value(value) over (partition BY session_ID ORDER BY id)      AS first_value_window_asc,
    last_value(value) over (partition BY session_ID ORDER BY id)       AS last_value_window_asc,
    array_accum(value) over (partition BY session_ID ORDER BY id DESC) AS window_desc,
    first_value(value) over (partition BY session_ID ORDER BY id DESC) AS first_value_window_desc,
    last_value(value) over (partition BY session_ID ORDER BY id DESC)  AS last_value_window_desc
FROM
    test
ORDER BY
    id

This gives

id  window_asc     first_value_window_asc  last_value_window_asc  window_desc    first_value_window_desc  last_value_window_desc  
--  -------------  ----------------------  ---------------------  -------------  -----------------------  ----------------------  
0   {100}          100                     100                    {140,120,100}  140                      100                     
1   {100,120}      100                     120                    {140,120}      140                      120                     
2   {100,120,140}  100                     140                    {140}          140                      140                     
3   {900}          900                     900                    {500,800,900}  500                      900                     
4   {900,800}      900                     800                    {500,800}      500                      800                     
5   {900,800,500}  900                     500                    {500}          500                      500           

The array_accum columns show the window that was used. There you can see the first and the current last value of the window.

The execution plan shows what happens:
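
Note that array_accum is not a built-in PostgreSQL function; it is presumably a user-defined aggregate (older PostgreSQL documentation uses that name for an example aggregate). As a sketch, assuming the built-in array_agg is an acceptable substitute, the two window columns can be reproduced without a custom aggregate:

```sql
-- Inline copy of the sample data, so the query runs on its own.
with test (id, session_id, value) as (
    values (0, 2, 100), (1, 2, 120), (2, 2, 140),
           (3, 1, 900), (4, 1, 800), (5, 1, 500)
)
select
    id,
    -- rows of the default frame when the partition is sorted ascending
    array_agg(value) over (partition by session_id order by id)      as window_asc,
    -- rows of the default frame when the partition is sorted descending
    array_agg(value) over (partition by session_id order by id desc) as window_desc
from test
order by id;
```

In each array the last element is the current row's value, which is exactly what last_value() returns for that frame.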

"Sort  (cost=444.23..449.08 rows=1940 width=12)"
"  Sort Key: id"
"  ->  WindowAgg  (cost=289.78..338.28 rows=1940 width=12)"
"        ->  Sort  (cost=289.78..294.63 rows=1940 width=12)"
"              Sort Key: session_id, id"
"              ->  WindowAgg  (cost=135.34..183.84 rows=1940 width=12)"
"                    ->  Sort  (cost=135.34..140.19 rows=1940 width=12)"
"                          Sort Key: session_id, id"
"                          ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=12)"

There you can see: first there is a sort by id (ascending) for the first three window functions.

This gives (as stated in the question)

id  window_asc     first_value_window_asc  last_value_window_asc  
--  -------------  ----------------------  ---------------------  
3   {900}          900                     900                    
4   {900,800}      900                     800                    
5   {900,800,500}  900                     500                    
0   {100}          100                     100                    
1   {100,120}      100                     120                    
2   {100,120,140}  100                     140    

Then you can see another sort: ORDER BY id DESC for the next three window functions. This sort gives:

id  window_asc     first_value_window_asc  last_value_window_asc  
--  -------------  ----------------------  ---------------------  
5   {900,800,500}  900                     500                    
4   {900,800}      900                     800                    
3   {900}          900                     900                    
2   {100,120,140}  100                     140                    
1   {100,120}      100                     120                    
0   {100}          100                     100                        

With this ordering in place, the DESC window functions are executed. The array_accum column shows the resulting windows:

id  window_desc    
--  -------------  
5   {500}          
4   {500,800}      
3   {500,800,900}  
2   {140}          
1   {140,120}      
0   {140,120,100}  

The resulting last_value DESC (along with first_value DESC) is now exactly identical to last_value ASC:

id  window_asc     last_value_window_asc  window_desc    last_value_window_desc  
--  -------------  ---------------------  -------------  ----------------------  
5   {900,800,500}  500                    {500}          500                     
4   {900,800}      800                    {500,800}      800                     
3   {900}          900                    {500,800,900}  900                     
2   {100,120,140}  140                    {140}          140                     
1   {100,120}      120                    {140,120}      120                     
0   {100}          100                    {140,120,100}  100    

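Now it is clear to me why last_value ASC equals last_value DESC: the second ORDER BY of the window functions yields an inverted window.

(The last sort in the execution plan is the final ORDER BY of the statement.)

As a small bonus: this query shows some optimization potential. If you call the DESC windows first and then the ASC ones, you do not need the third sort, because the rows are already in the right order at that point.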