Solving Query-Errors in Vertica [Vertica][VJDBC](4160) and [Vertica][VJDBC](4680)

I am having some trouble getting a Vertica query to work. Suppose I have a relation defined as follows:

CREATE TABLE KOMM (
   MANDT         VARCHAR(3),
   DOCNUM        VARCHAR(16),
   COUNTER       VARCHAR(3),
   NUM           VARCHAR(6),
   NAM           VARCHAR(30), 
   INNUM         VARCHAR(6),
   KOMMLEVEL     VARCHAR(2),  
   MSG           VARCHAR(1000),
   NUM_UNH       VARCHAR(6)
);

and insert some sample values:

insert into KOMM values ('200','45320824','000','000003','START','000002','02','START OF MESSAGE');
insert into KOMM values ('200','45320824','000','000004','INTERMED','000003','03','EXAMPLEEXAMPLEEXAMPLE');
insert into KOMM values ('200','45320824','000','000005','ADV_01','000003','03','TESTADV1');
insert into KOMM values ('200','45320824','000','000011','END','000010','04','01234567');
...
insert into KOMM values ('200','45320824','000','000022','START','000002','02','CONTINUE START OF MESSAGE');
insert into KOMM values ('200','45320824','000','000023','INTERMED','000003','03','SECONDEXAMPLEEXAMPLEEXAMPLE');
insert into KOMM values ('200','45320824','000','000024','ADV_01','000003','03','SECONDTESTADV1');
insert into KOMM values ('200','45320824','000','000030','END','000010','04','01234567');

Now I want to update the relation with the following query:

UPDATE KOMM E
SET NUM_UNH = (SELECT MAX(X.NUM)
                     FROM KOMM X
                    WHERE X.NAM IN ('START')
                      AND X.MANDT = E.MANDT
                      AND X.DOCNUM = E.DOCNUM
                      AND X.NUM <= E.NUM
                  )
FROM KOMM X
WHERE E.MANDT = X.MANDT AND E.DOCNUM = X.DOCNUM
;

However, this query throws the following error:

Execution error: [Vertica]VJDBC ERROR: Non-equality correlated subquery expression is not supported

I assume this is because Vertica does not allow <=, >=, < and > comparisons in correlated subqueries? See the Vertica Documentation for Subquery Restrictions.

So I tried to work around it using BETWEEN:

UPDATE KOMM E
SET NUM_UNH = (SELECT max(X.NUM)
                     FROM KOMM X
                    WHERE X.NAM IN ('START')
                      AND X.MANDT = E.MANDT
                      AND X.DOCNUM = E.DOCNUM
                      AND X.NUM BETWEEN '000000' AND (E.NUM)
                  )
from KOMM X
where E.MANDT = X.MANDT and E.DOCNUM = X.DOCNUM
;

This results in the same error:

Execution error: [Vertica]VJDBC ERROR: Non-equality correlated subquery expression is not supported

So I tried dropping the condition altogether and ran into another problem when executing the following query:

UPDATE KOMM E
   SET NUM_UNH = (SELECT max(X.NUM)
                         FROM KOMM X
                        WHERE X.NAM IN ('START')
                          AND X.MANDT = E.MANDT
                          AND X.DOCNUM = E.DOCNUM
                      )
from KOMM X
where E.MANDT = X.MANDT and E.DOCNUM = X.DOCNUM
;

which results in the following error:

Execution error: [Vertica]VJDBC ERROR: Self joins in UPDATE statements are not allowed [Vertica][VJDBC]Detail: Target relation "da592a51-45ee-4d3e-9983-e8a3e56fd852_2fd1ec98-bb71-4ad0-8d33-d751e209dcdd".KOMM also appears in the FROM list

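I found a "workaround" for this problem by replacing "from KOMM X" with "from (select * from KOMM) X". The query does execute, but it does not do what I intended. The goal is to fill NUM_UNH with the NUM value of a 'START' row for every row until the next higher 'START' NUM appears in the table, so that the table can finally be grouped to show one row per 'START':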

SELECT
M.MANDT, M.DOCNUM, M.NUM_UNH,
max(case
  when M.NAM = 'START' then substring(cast(M.MSG as varchar(99)),15,6)
end) as UNH_SEG,
max(case
    when M.NAM = 'END'
    then substring(cast(M.MSG as varchar(36)),4,33)
end) as PMSG
from KOMM M
group by M.MANDT, M.DOCNUM, NUM_UNH
;

[Image: first row of result] [Image: second row of result]

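Unfortunately, I have not been able to find a solution to these problems, which is why I hope you can help me. Thanks in advance for your help and suggestions!

Best regards, 摩多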

Is this what you want?

UPDATE KOMM E
SET NUM_UNH = (SELECT MAX(CASE WHEN X.NUM <= E.NUM THEN X.NUM END)
               FROM KOMM X
               WHERE X.NAM IN ('START') AND
                     X.MANDT = E.MANDT AND
                     X.DOCNUM = E.DOCNUM
              );

I am not familiar with Vertica's restrictions on updates, but this is a simpler query and might work.

EDIT:

Does this work?

UPDATE KOMM E
    SET NUM_UNH = X.NUM
FROM (SELECT MANDT, DOCNUM, MAX(CASE WHEN X.NUM <= E.NUM THEN X.NUM END) as NUM
      FROM KOMM X
      WHERE X.NAM IN ('START')  
      GROUP BY MANDT, DOCNUM                     
     ) X
WHERE X.MANDT= E.MANDT AND X.DOCNUM = E.DOCNUM;

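As I understand your question, you want a report that, for your example, contains two rows, each with the NUM value of the row whose nam is 'START' and the last 5 characters of the message of the following row whose nam is 'END'. The only thing I can't figure out is where in the world you get the input for the string 'GE' in the new column unh_seg ...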

I would try a completely different approach. It involves grouping, OLAP functions and nested queries.

I see that we have two groups of rows (if we order by num), each with a sequence of one 'START', one or more other things, and one 'END' in the nam column.

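We need an additional column to tell the two groups apart. Vertica has a rather unique OLAP function, CONDITIONAL_TRUE_EVENT(), that comes in very handy here. It starts at 0 and increments by 1 every time the Boolean expression in the parentheses is true.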

Let's create and populate your table, with the data types as I would use them:

DROP TABLE IF EXISTS komm ;
-- note that I use numbers, especially integers, wherever I can
CREATE TABLE KOMM (
   mandt         INT,
   docnum        INT,
   counter       INT,
   num           INT,
   nam           VARCHAR(30), 
   innum         INT,
   kommlevel     INT,  
   msg           VARCHAR(64)
);

INSERT INTO KOMM
          SELECT 200,45320824,000,000003,'START',    2,2,'START OF MESSAGE'
UNION ALL SELECT 200,45320824,000,000004,'INTERMED', 3,3,'EXAMPLEEXAMPLEEXAMPLE'
UNION ALL SELECT 200,45320824,000,000005,'ADV_01',   3,3,'TESTADV1'
UNION ALL SELECT 200,45320824,000,000011,'END',      0,4,'01234567'
UNION ALL SELECT 200,45320824,000,000022,'START',    2,2,'CONTINUE START OF MESSAGE'
UNION ALL SELECT 200,45320824,000,000023,'INTERMED', 3,3,'SECONDEXAMPLEEXAMPLEEXAMPLE'
UNION ALL SELECT 200,45320824,000,000024,'ADV_01',   3,3,'SECONDTESTADV1'
UNION ALL SELECT 200,45320824,000,000030,'END',     10,4,'01234567'
;
COMMIT;
SELECT * FROM komm;
-- out  mandt |  docnum  | counter | num |   nam    | innum | kommlevel |             msg             
-- out -------+----------+---------+-----+----------+-------+-----------+-----------------------------
-- out    200 | 45320824 |       0 |   3 | START    |     2 |         2 | START OF MESSAGE
-- out    200 | 45320824 |       0 |   4 | INTERMED |     3 |         3 | EXAMPLEEXAMPLEEXAMPLE
-- out    200 | 45320824 |       0 |   5 | ADV_01   |     3 |         3 | TESTADV1
-- out    200 | 45320824 |       0 |  11 | END      |     0 |         4 | 01234567
-- out    200 | 45320824 |       0 |  22 | START    |     2 |         2 | CONTINUE START OF MESSAGE
-- out    200 | 45320824 |       0 |  23 | INTERMED |     3 |         3 | SECONDEXAMPLEEXAMPLEEXAMPLE
-- out    200 | 45320824 |       0 |  24 | ADV_01   |     3 |         3 | SECONDTESTADV1
-- out    200 | 45320824 |       0 |  30 | END      |    10 |         4 | 01234567
-- out (8 rows)
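
To see what CONDITIONAL_TRUE_EVENT() contributes on its own, here is a small sketch that reuses the komm table above and only shows the generated column next to num and nam. The alias sess_id is just a name I picked. If the function behaves as described above, the four rows from num 3 to 11 should share one sess_id and the four rows from num 22 to 30 should share the next one.

```sql
-- sketch: show only the generated "session id" per row
-- the counter goes up each time a row with nam='START' is reached
SELECT
  num
, nam
, CONDITIONAL_TRUE_EVENT(nam='START') OVER (
    ORDER BY num
  ) AS sess_id
FROM komm
ORDER BY num;
```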

With the table built like this, I run the query below.

WITH
-- need a column to distinguish the two groups between 'START' and 'END'
-- hence a nested query to generate a "session id" ..
w_sess_id AS (
  SELECT
    CONDITIONAL_TRUE_EVENT(nam='START') OVER (
     ORDER BY num
    ) AS sess_id
  , *
  FROM komm
) 
SELECT
  mandt
, docnum
, MAX(CASE nam WHEN 'START' THEN num END) AS num_unh
    --^-- this returns NULL if nam is not 'START'
, 'GE' AS unh_seg -- I have no idea where you could get this from, so I put in a constant
, MAX(CASE nam WHEN 'END'   THEN  RIGHT(msg,5) END) AS msg
FROM w_sess_id
GROUP BY 
  sess_id
, mandt
, docnum;
-- out  mandt |  docnum  | num_unh | unh_seg |  msg  
-- out -------+----------+---------+---------+-------
-- out    200 | 45320824 |       3 | GE      | 34567
-- out    200 | 45320824 |      22 | GE      | 34567
-- out (2 rows)
-- out 
-- out Time: First fetch (2 rows): 46.210 ms. All rows formatted: 46.301 ms
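
If you still want NUM_UNH on every row, as your original UPDATE intended, a window aggregate might be a way around the correlated subquery. This is only a minimal sketch, assuming the komm table as I created it above and that analytic MAX() with an explicit ROWS frame is available in your Vertica version. I have not run it against your real data.

```sql
-- sketch: carry the num of the latest 'START' row forward to every row
-- the CASE returns NULL for non-'START' rows, so MAX() ignores them
SELECT
  mandt
, docnum
, num
, nam
, MAX(CASE nam WHEN 'START' THEN num END) OVER (
    PARTITION BY mandt, docnum
    ORDER BY num
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS num_unh
FROM komm
ORDER BY mandt, docnum, num;
```

On the sample data this should give num_unh = 3 for num 3 to 11 and num_unh = 22 for num 22 to 30. The result could feed a CREATE TABLE ... AS SELECT, or the derived-table form of UPDATE you mentioned as your workaround, instead of a correlated subquery.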