Oracle: instr+substr instead of regexp_substr

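I got this query from another post of mine. It uses REGEXP_SUBSTR() to extract specific pieces of information from a string in Oracle. It works well, but only on small amounts of data. On a table with more than 300,000 records it is very slow, and I have read that INSTR + SUBSTR may be faster. The sample query is: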

SELECT REGEXP_SUBSTR(value, '(^|\|)\s*24=\s*(.*?)\s*(\||$)',  1, 1, NULL, 2)  AS "24",
       REGEXP_SUBSTR(value, '(^|\|)\s*35=\s*(.*?)\s*(\||$)',  1, 1, NULL, 2)  AS "35",
       REGEXP_SUBSTR(value, '(^|\|)\s*47A=\s*(.*?)\s*(\||$)', 1, 1, NULL, 2) AS "47A",
       REGEXP_SUBSTR(value, '(^|\|)\s*98A=\s*(.*?)\s*(\||$)', 1, 1, NULL, 2) AS "98A"
FROM   table_name

Example table:

CREATE TABLE table_name (value) AS
SELECT '35= 88234.00 | 47A= Shawn | 98A= This is a comment |' FROM DUAL UNION ALL
SELECT '24= 123.00 | 98A= This is a comment | 47A= Derick |' FROM DUAL;

The query output is:

24      35        47A     98A
(null)  88234.00  Shawn   This is a comment
123.00  (null)    Derick  This is a comment

Can anyone give me an example of what this query would look like if I used INSTR + SUBSTR instead?

Thanks.

SELECT CASE 
       WHEN start_24 > 0
       THEN TRIM(
              SUBSTR(
                value,
                start_24 + 5,
                INSTR(value, '|', start_24 + 5) - (start_24+5)
              )
           )
       END AS "24",
       CASE 
       WHEN start_35 > 0
       THEN TRIM(
              SUBSTR(
                value,
                start_35 + 5,
                INSTR(value, '|', start_35 + 5) - (start_35+5)
              )
           )
       END AS "35",
       CASE 
       WHEN start_47a > 0
       THEN TRIM(
              SUBSTR(
                value,
                start_47a + 6,
                INSTR(value, '|', start_47a + 6) - (start_47a+6)
              )
           )
       END AS "47A",
       CASE 
       WHEN start_98a > 0
       THEN TRIM(
              SUBSTR(
                value,
                start_98a + 6,
                INSTR(value, '|', start_98a + 6) - (start_98a+6)
              )
           )
       END AS "98A"
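FROM   (
  -- Locate the start of each "| key=" tag; INSTR returns 0 when the key is absent.
  SELECT value,
         INSTR(value, '| 24=') AS start_24,
         INSTR(value, '| 35=') AS start_35,
         INSTR(value, '| 47A=') AS start_47a,
         INSTR(value, '| 98A=') AS start_98a
  FROM   (
    -- Prefix "| " so that a key at the very start of the string matches the same tag.
    SELECT '| ' || value AS value FROM table_name
  )
);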

For your sample data, this outputs:

24      35        47A     98A
(null)  88234.00  Shawn   This is a comment
123.00  (null)    Derick  This is a comment
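
To make the offset arithmetic easier to follow, here is a small sketch that breaks the "35" extraction into its intermediate values for one of the sample rows. The CTE name one_row and the column aliases are only illustrative; the literal is the first sample row with the same '| ' prefix the query above adds.

```sql
-- One sample row, prefixed with '| ' in the same way as the main query.
WITH one_row AS (
  SELECT '| ' || '35= 88234.00 | 47A= Shawn | 98A= This is a comment |' AS value
  FROM   DUAL
)
SELECT INSTR(value, '| 35=')                        AS tag_start,   -- where "| 35=" begins
       INSTR(value, '| 35=') + 5                    AS value_start, -- first character after "="
       INSTR(value, '|', INSTR(value, '| 35=') + 5) AS next_pipe,   -- delimiter that ends the field
       TRIM(
         SUBSTR(
           value,
           INSTR(value, '| 35=') + 5,
           INSTR(value, '|', INSTR(value, '| 35=') + 5)
             - (INSTR(value, '| 35=') + 5)
         )
       )                                            AS extracted
FROM   one_row;
```

For this row the tag starts at position 1, the value starts at position 6, and the next pipe is at position 16, so SUBSTR takes 10 characters and TRIM reduces them to 88234.00. The offset is 5 for the two-character keys because '| 35=' is five characters long, and 6 for '| 47A=' and '| 98A='.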


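Given the data in your example, it looks like you could also use a procedural approach to extract the data, although I doubt it would be any faster.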

For example, the following function, get24, extracts column "24" using only INSTR and SUBSTR.
CREATE OR REPLACE FUNCTION get24(value IN VARCHAR2) RETURN VARCHAR2
IS
    i PLS_INTEGER;
    s VARCHAR2(32767);
BEGIN
  -- Only handles a string that begins with '24= ', as in the sample data.
  i := INSTR(value, '24= ');
  IF (i <> 1) THEN
    RETURN NULL;
  END IF;
  -- Keep everything after '24= ' and cut at the next ' | ' delimiter.
  s := SUBSTR(value, i + 4);
  i := INSTR(s, ' | ');
  IF (i = 0) THEN
    RETURN NULL;
  END IF;
  RETURN SUBSTR(s, 1, i - 1);
END;
/

SELECT get24(value) "24" FROM table_name;

You could also try a pipelined function and extract all of the data inside it.
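
A minimal sketch of what such a pipelined function might look like is shown below. The type names parsed_row_t and parsed_tab_t, the function name parse_values, and the local helper get_field are all hypothetical names chosen for illustration; the sketch assumes the same table_name and the same "key= value |" layout as the sample data, and it has not been benchmarked against the other approaches.

```sql
-- Hypothetical object and collection types describing one parsed row.
CREATE OR REPLACE TYPE parsed_row_t AS OBJECT (
  f24  VARCHAR2(4000),
  f35  VARCHAR2(4000),
  f47a VARCHAR2(4000),
  f98a VARCHAR2(4000)
);
/
CREATE OR REPLACE TYPE parsed_tab_t AS TABLE OF parsed_row_t;
/
CREATE OR REPLACE FUNCTION parse_values RETURN parsed_tab_t PIPELINED
IS
  v_value VARCHAR2(32767);

  -- Returns the trimmed text between "| <key>=" and the next "|",
  -- or NULL when the key or the closing "|" is missing.
  FUNCTION get_field(p_value IN VARCHAR2, p_key IN VARCHAR2) RETURN VARCHAR2
  IS
    v_tag   VARCHAR2(100) := '| ' || p_key || '=';
    v_start PLS_INTEGER;
    v_end   PLS_INTEGER;
  BEGIN
    v_start := INSTR(p_value, v_tag);
    IF v_start = 0 THEN
      RETURN NULL;
    END IF;
    v_start := v_start + LENGTH(v_tag);       -- first character after "="
    v_end   := INSTR(p_value, '|', v_start);  -- delimiter that ends the field
    IF v_end = 0 THEN
      RETURN NULL;
    END IF;
    RETURN TRIM(SUBSTR(p_value, v_start, v_end - v_start));
  END get_field;
BEGIN
  FOR r IN (SELECT value FROM table_name) LOOP
    -- Prefix "| " so that a key at the start of the string matches the same tag.
    v_value := '| ' || r.value;
    PIPE ROW (parsed_row_t(
      get_field(v_value, '24'),
      get_field(v_value, '35'),
      get_field(v_value, '47A'),
      get_field(v_value, '98A')
    ));
  END LOOP;
  RETURN;
END;
/

SELECT f24 AS "24", f35 AS "35", f47a AS "47A", f98a AS "98A"
FROM   TABLE(parse_values());
```

Each row is parsed once inside PL/SQL and streamed back through PIPE ROW, so the caller can query the result like an ordinary table. Whether this beats the plain INSTR + SUBSTR query would need to be measured on the real data.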