Creating a table from another using a CASE expression

Given a table t1 as follows:

+---------+----------+
| bin_val |  bin_cnt |
+---------+----------+
|       0 |        2 |
|       4 |       10 |
|       8 |       15 |
|      12 |       12 |
|      16 |        6 |
|      20 |        1 |
+---------+----------+

I need to create a temporary table bin_vals_selected from table t1 in Netezza, which I do as follows:

CREATE TEMP TABLE bin_vals_selected as (
  -- statements
) DISTRIBUTE ON RANDOM;

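I need to write the statements that select bin_val from t1 conditionally, based on a variable $bin_selected that is available in my production interface (it is a simple string substitution performed before the SQL is passed to Netezza for execution).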

What I need in the temporary table bin_vals_selected is the following.

When $bin_selected = 'all', bin_vals_selected should contain all distinct bin_val values from t1. The statement for this is:

SELECT DISTINCT bin_val as bin_selected FROM t1

When $bin_selected = 'first', bin_vals_selected should contain the bin_val from t1 that has the highest bin_cnt. The statement for this is:

SELECT bin_val as bin_selected FROM t1 ORDER BY bin_cnt DESC LIMIT 1

When $bin_selected = 'second', bin_vals_selected should contain the bin_val from t1 that has the second-highest bin_cnt. I am not sure how to write the statement for this.

I am trying to handle this with a CASE expression that creates the table depending on the value of $bin_selected, but it does not work.

CREATE TEMP TABLE bin_vals_selected AS
(
  SELECT * FROM (
    CASE 
      WHEN $bin_selected = 'all' THEN
        (SELECT DISTINCT bin_val AS bin_selected FROM t1 AS a)
      WHEN $bin_selected = 'first' THEN
        (SELECT bin_val AS bin_selected FROM t1 AS a ORDER BY bin_cnt DESC LIMIT 1 )
    END
  )
) DISTRIBUTE ON RANDOM;

Although the syntax above is Netezza-centric (which is mostly similar to Postgres), a Postgres solution would also help, since I have a local Postgres instance to try things out on.

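I would start by numbering the rows with a window function such as row_number(), and then apply a condition. You can do it like this: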

select bin_val
from (select bin_val,
             row_number() over (order by bin_cnt desc) as seqnum
      from t1
     ) t
where ($bin_selected = 'all') or
      ($bin_selected = 'first' and seqnum = 1) or
      ($bin_selected = 'second' and seqnum = 2);

If you want to allow ties when bins have the same count, use dense_rank() instead of row_number().
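To tie this back to the temporary table in the question, here is a minimal sketch for a local Postgres instance. It assumes the interface has already replaced $bin_selected with the quoted literal 'second', and it leaves out the Netezza-specific DISTRIBUTE ON RANDOM clause, which Postgres does not accept.

```sql
-- Sample data copied from the question's t1.
CREATE TEMP TABLE t1 (bin_val int, bin_cnt int);
INSERT INTO t1 VALUES (0, 2), (4, 10), (8, 15), (12, 12), (16, 6), (20, 1);

-- Assumption: $bin_selected was substituted with the literal 'second'
-- before the statement reached the database.
CREATE TEMP TABLE bin_vals_selected AS
SELECT bin_val AS bin_selected
FROM (SELECT bin_val,
             row_number() OVER (ORDER BY bin_cnt DESC) AS seqnum
      FROM t1
     ) t
WHERE ('second' = 'all') OR
      ('second' = 'first' AND seqnum = 1) OR
      ('second' = 'second' AND seqnum = 2);

-- With the sample data this holds one row: bin_selected = 12
-- (bin_cnt 12 is the second-highest count, after 15).
SELECT * FROM bin_vals_selected;
```

On Netezza, the same SELECT should be able to go inside the CREATE TEMP TABLE ... DISTRIBUTE ON RANDOM wrapper shown in the question, though I have only reasoned about this against Postgres.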

If you really must do it in a single SQL statement, you can use a UNION similar to the following:

SELECT DISTINCT bin_val as bin_selected 
FROM t1
WHERE 'all' = $bin_selected

UNION ALL

SELECT bin_val as bin_selected
FROM (
  SELECT bin_val, RANK() OVER(ORDER BY bin_cnt DESC) AS BinCountRank
  FROM t1
) src
WHERE BinCountRank = 1
AND 'first' = $bin_selected

UNION ALL

SELECT bin_val as bin_selected
FROM (
  SELECT bin_val, RANK() OVER(ORDER BY bin_cnt DESC) AS BinCountRank
  FROM t1
) src
WHERE BinCountRank = 2
AND 'second' = $bin_selected

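It is not very efficient, but it should do the trick. It does give you flexibility in your source queries. You may have to tinker with RANK() to handle any ties. It also assumes that every query in the UNION chain returns exactly the same number of columns with the same data types.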
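On the point about ties: RANK() leaves a gap after tied rows, so if two bins share the highest count, no row gets rank 2 and the 'second' branch returns nothing. A small sketch of the difference, using a hypothetical table t1_ties (not the t1 from the question) and written for Postgres:

```sql
-- Hypothetical sample data with a tie for the highest count.
CREATE TEMP TABLE t1_ties (bin_val int, bin_cnt int);
INSERT INTO t1_ties VALUES (0, 2), (4, 15), (8, 15), (12, 12);

SELECT bin_val,
       bin_cnt,
       ROW_NUMBER() OVER (ORDER BY bin_cnt DESC) AS rn,   -- 1, 2, 3, 4: the tie is broken arbitrarily
       RANK()       OVER (ORDER BY bin_cnt DESC) AS rnk,  -- 1, 1, 3, 4: rank 2 is skipped
       DENSE_RANK() OVER (ORDER BY bin_cnt DESC) AS drnk  -- 1, 1, 2, 3: no gap
FROM t1_ties
ORDER BY bin_cnt DESC, bin_val;
```

If "second" should mean the second-highest distinct count, swapping RANK() for DENSE_RANK() in the queries above is probably what you want.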

For your case, I would use Gordon's answer above... it is cleaner and faster.