Combine crosstab function with DISTINCT ON

Question

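I have two tables, details and data. I have already joined the two tables, and the crosstab part works.

I only want to show the latest data for each serial. See the current and desired output below.

Question: How can I use DISTINCT ON in this crosstab query?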

Table details:

serial   | date                 |    line      |   total_judgement
---------+----------------------+--------------+----------------
 123     | 2016/05/21 12:00:00  |      A       |       1
 456     | 2016/05/21 12:02:00  |      A       |       0
 456     | 2016/05/21 12:05:00  |      A       |       0

Table data:

serial   |     date             | readings   |   value
---------+----------------------+------------+-------------
 123     | 2016/05/21 12:00:00  | reading1   |  1.2342
 123     | 2016/05/21 12:00:00  | reading2   |  2.3213
 123     | 2016/05/21 12:00:00  | reading3   |  3.4232
 456     | 2016/05/21 12:00:02  | reading1   |  1.2546
 456     | 2016/05/21 12:00:02  | reading2   |  2.3297
 456     | 2016/05/21 12:00:02  | reading3   |  3.4264
 456     | 2016/05/21 12:00:05  | reading1   |  1.9879
 456     | 2016/05/21 12:00:05  | reading2   |  2.4754
 456     | 2016/05/21 12:00:05  | reading3   |  3.4312

Current output:

serial   | line |      date            | total_judgement| reading1  |   reading2  |   reading3  
---------+------+----------------------+----------------+-----------+-------------+--------------
123      |  A   |  2016/05/21 12:00:00 |       1        |  1.2342   |   2.3213    |   3.4232      
456      |  A   |  2016/05/21 12:00:02 |       0        |  1.2546   |   2.3297    |   3.4264 
456      |  A   |  2016/05/21 12:00:02 |       0        |  1.9879   |   2.4754    |   3.4312

Desired output:

serial   | line |     date             | total_judgement | reading1  |   reading2  |   reading3   
---------+------+----------------------+-----------------+-----------+-------------+--------------
123      |  A   |  2016/05/21 12:00:00 |         1       |  1.2342   |   2.3213    |   3.4232  
456      |  A   |  2016/05/21 12:00:05 |         0       |  1.9879   |   2.4754    |   3.4312

Here is my code:

SELECT * FROM crosstab (
            $$ SELECT
                tb2.serial,
                tb1.line,
                tb2.date,
                tb1.total_judgement,
                tb2.readings,
                tb2.value
            FROM
                data tb2
                    INNER JOIN details tb1 ON (tb2.serial = tb1.serial 
             AND   tb2.date = tb1.date)
            ORDER BY tb2.date ASC $$,
            $$ VALUES ('reading1'),('reading2'),('reading3')$$
  ) as ct("S/N" VARCHAR (50),
          "Line" VARCHAR(3),
          "Date" TIMESTAMP,
          "TotalJudgement" CHARACTER(1),
          "Reading1" FLOAT8,
          "Reading2" FLOAT8,
          "Reading3" FLOAT8);

Note

I need to join the two tables on both serial and date.

I thought DISTINCT ON might help here, but I don't seem to get the correct results when I use DISTINCT ON serial.

Answer 1

Your code and your data do not quite match your "Current output". Perhaps you can get the desired output by using DISTINCT ON (inside the crosstab function) like this:

SELECT DISTINCT ON (tb2.serial, tb2.readings)
       tb2.serial,
       tb1.line,
       tb2.date,
       tb1.total_judgement,
       tb2.readings,
       tb2.value
FROM data tb2
JOIN details tb1 ON (tb2.serial = tb1.serial AND tb2.date = tb1.date)
-- leading ORDER BY expressions must match DISTINCT ON; date DESC picks the latest row
ORDER BY tb2.serial, tb2.readings, tb2.date DESC

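But this only works for your data if it follows the same pattern as the sample data here. Otherwise, perhaps you can explain more precisely what kind of data you have and what output you want.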
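
For illustration, a minimal sketch of how such a DISTINCT ON query could be placed inside the crosstab call. It uses the column names from the question's tables (serial, total_judgement). The output column types are assumptions, since the actual table definitions are not shown. crosstab() comes from the additional module tablefunc, which has to be installed once per database.

```sql
-- crosstab() is provided by the tablefunc extension.
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT * FROM crosstab(
   $$
   -- one row per (serial, reading): the one with the latest date
   SELECT DISTINCT ON (tb2.serial, tb2.readings)
          tb2.serial, tb1.line, tb2.date, tb1.total_judgement
        , tb2.readings, tb2.value
   FROM   data tb2
   JOIN   details tb1 ON tb2.serial = tb1.serial AND tb2.date = tb1.date
   ORDER  BY tb2.serial, tb2.readings, tb2.date DESC
   $$
 , $$VALUES ('reading1'), ('reading2'), ('reading3')$$
   ) AS ct("S/N" text                -- assumed type
         , "Line" text               -- assumed type
         , "Date" timestamp
         , "TotalJudgement" text     -- assumed type
         , "Reading1" float8
         , "Reading2" float8
         , "Reading3" float8);
```

Note that this picks the latest row per reading, not per serial. If some reading is missing for the latest date of a serial, values from different dates could end up in the same output row.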

Answer 2

Making a few assumptions, this may be what you are looking for:

SELECT * FROM crosstab (
   $$
   SELECT t2.serial
        , t1.line
        , t2.date
        , t1.total_judgement
        , t2.readings
        , t2.value
   FROM  (SELECT DISTINCT ON (serial) * FROM details ORDER BY serial, date DESC) t1
   JOIN   data t2 USING (serial, date)
   ORDER  BY t2.serial
   $$
 , $$VALUES ('reading1'),('reading2'),('reading3')$$
   ) AS ct("S/N" text
         , "Line" text
         , "Date" timestamp
         , "TotalJudgement" text
         , "Reading1" float8
         , "Reading2" float8
         , "Reading3" float8);

If you apply DISTINCT ON (serial) after the join, you keep only a single row from data. Move the DISTINCT step into a subquery on details instead, to get all readings in data for the latest row per serial in details.
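
The difference can be inspected without crosstab by running the source queries on their own. The following is only a sketch against the question's tables details and data, assuming date is a timestamp column:

```sql
-- Step 1: the latest row per serial in details.
-- DISTINCT ON keeps the first row of each serial in ORDER BY order.
SELECT DISTINCT ON (serial) *
FROM   details
ORDER  BY serial, date DESC;

-- Step 2: join that to data. All readings for exactly that
-- (serial, date) combination survive.
SELECT t2.serial, t1.line, t2.date, t1.total_judgement
     , t2.readings, t2.value
FROM  (
   SELECT DISTINCT ON (serial) *
   FROM   details
   ORDER  BY serial, date DESC
   ) t1
JOIN   data t2 USING (serial, date)
ORDER  BY t2.serial, t2.readings;

-- For contrast: DISTINCT ON applied after the join.
-- Only one row of data is kept per serial, so the other readings are lost.
SELECT DISTINCT ON (t2.serial)
       t2.serial, t2.date, t2.readings, t2.value
FROM   details t1
JOIN   data t2 USING (serial, date)
ORDER  BY t2.serial, t2.date DESC;
```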

Aside: DISTINCT and DISTINCT ON ( expression [, ...] ) are not "functions" but SQL constructs. Basics:

Select first row in each GROUP BY group?
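
A short sketch of the distinction, using the details table from the question. The parentheses after plain DISTINCT in the second statement are just parentheses around an expression, not a function call:

```sql
-- DISTINCT: removes rows that are duplicates over the whole SELECT list.
SELECT DISTINCT serial, line
FROM   details;

-- Same result as above: "(serial)" is just a parenthesized expression.
SELECT DISTINCT (serial), line
FROM   details;

-- DISTINCT ON (...): a PostgreSQL extension. Keeps the first row per
-- distinct value of the listed expressions. The leading ORDER BY
-- expressions must match the DISTINCT ON expressions.
SELECT DISTINCT ON (serial) serial, line, date
FROM   details
ORDER  BY serial, date DESC;
```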

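While at it, I simplified your code a bit. That is not essential to the answer.

If there are many rows per serial in table details, one of these techniques may be more efficient than DISTINCT ON: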

Optimize GROUP BY query to retrieve latest record per user

I don't know your table definitions, cardinalities, etc.
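
As one possible illustration of such an alternative, here is a sketch using a LATERAL subquery. It assumes a hypothetical table serials with one row per serial and a hypothetical index name details_serial_date_idx; neither appears in the question. Whether this beats DISTINCT ON depends on the actual data distribution and indexes.

```sql
-- Hypothetical helper table with one row per serial (not in the question):
-- CREATE TABLE serials (serial text PRIMARY KEY);  -- column type is an assumption

-- Index supporting "latest row for a given serial" lookups.
CREATE INDEX IF NOT EXISTS details_serial_date_idx
    ON details (serial, date DESC);

-- For each serial, fetch only its latest details row.
SELECT d.*
FROM   serials s
CROSS  JOIN LATERAL (
   SELECT *
   FROM   details
   WHERE  serial = s.serial
   ORDER  BY date DESC
   LIMIT  1
   ) d;
```

This query could take the place of the DISTINCT ON subquery in the crosstab source query.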
