Find a space after the nth character and split into a new row

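I have a large string stored as a single row in a table. I need a SELECT query that splits the large string into rows after every 100 characters, without splitting in the middle of a word. Basically, the query should find the first space after 100 characters and split into a new row there.

I have used this query; it splits after every 100 characters, but it breaks in the middle of words.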

SELECT  REGEXP_REPLACE ( col_large_string , '(.{100})' , '\1' || CHR (10) ) AS split_to_rows
FROM tab_large_string where string_id = 1;

You can use this regular expression:

SELECT  REGEXP_REPLACE ( col_large_string , '((\w+\s+){100})' , '\1' || CHR (10) ) AS split_to_rows
FROM tab_large_string where string_id = 1;

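\w+ matches one or more word characters.
\s+ matches one or more whitespace characters.
(\w+\s+) matches a word followed by whitespace.
(\w+\s+){100} then matches (a word followed by whitespace) x 100.

Note that this breaks after every 100 words, not after every 100 characters.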
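
To see the pattern at work on a short string, here is a minimal sketch that breaks after every 3 words instead of 100. The literal input and the count of 3 are made up for illustration; the \1 backreference puts the matched words back in front of the newline.

```sql
-- Illustrative only: a hard-coded string and a group size of 3 words.
-- Each run of 3 words (including its trailing space) is kept via \1
-- and followed by a newline, so the result should be:
--   'one two three ' || CHR(10) || 'four five six ' || CHR(10) || 'seven'
SELECT REGEXP_REPLACE(
         'one two three four five six seven',
         '((\w+\s+){3})',
         '\1' || CHR(10)
       ) AS split_to_rows
FROM   DUAL;
```

The trailing space stays at the end of each line because it is part of the captured group.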

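You do not need (slow) regular expressions; you can do this with simple (faster) string functions.

If you want to replace the spaces with newline characters, then: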

WITH bounds ( str, end_pos ) AS (
  SELECT col_large_string,
         INSTR(col_large_string, ' ', 101)
  FROM   tab_large_string
UNION ALL
  SELECT SUBSTR(str, 1, end_pos - 1)
         || CHR(10)
         || SUBSTR(str, end_pos + 1),
         INSTR(str, ' ', end_pos + 101)
  FROM   bounds
  WHERE  end_pos > 0
)
SELECT str AS split_to_lines
FROM   bounds
WHERE  end_pos = 0;
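
The query above hinges on the third argument of INSTR, which is the position to start searching from; INSTR returns 0 when no match is found, and that is what ends the recursion. A small sketch on a made-up literal:

```sql
-- Illustrative only: the input string is hypothetical.
-- Spaces are at positions 4, 10 and 16 of 'the quick brown fox'.
SELECT INSTR('the quick brown fox', ' ', 6)  AS next_space, -- 10: first space at or after position 6
       INSTR('the quick brown fox', ' ', 17) AS no_space    -- 0: no space at or after position 17
FROM   DUAL;
```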

If you want each line to be in a new row, then:

WITH bounds ( str, start_pos, end_pos ) AS (
  SELECT col_large_string,
         1,
         INSTR(col_large_string, ' ', 101)
  FROM   tab_large_string
UNION ALL
  SELECT str,
         end_pos + 1,
         INSTR(str, ' ', end_pos + 101)
  FROM   bounds
  WHERE  end_pos > 0
)
SELECT CASE end_pos
       WHEN 0
       THEN SUBSTR(str, start_pos)
       ELSE SUBSTR(str, start_pos, end_pos - start_pos)
       END AS split_to_rows
FROM   bounds;
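
As a worked example, the same query can be run against a short string with a width of 10 instead of 100. The sample_data CTE, its contents, and the ORDER BY are assumptions added for illustration; they are not part of the original query.

```sql
-- Illustrative only: hypothetical sample data and a width of 10.
WITH sample_data ( col_large_string ) AS (
  SELECT 'the quick brown fox jumps over the lazy dog' FROM DUAL
),
bounds ( str, start_pos, end_pos ) AS (
  -- First line: break at the first space at or after position 11.
  SELECT col_large_string,
         1,
         INSTR(col_large_string, ' ', 11)
  FROM   sample_data
UNION ALL
  -- Next line starts after the previous break; look for the next
  -- space at least 10 characters further on.
  SELECT str,
         end_pos + 1,
         INSTR(str, ' ', end_pos + 11)
  FROM   bounds
  WHERE  end_pos > 0
)
SELECT CASE end_pos
       WHEN 0
       THEN SUBSTR(str, start_pos)
       ELSE SUBSTR(str, start_pos, end_pos - start_pos)
       END AS split_to_rows
FROM   bounds
ORDER BY start_pos;
-- Rows, in order:
--   the quick brown
--   fox jumps over
--   the lazy dog
```

Each row is at least 10 characters long and ends at a word boundary; the last row simply takes whatever remains.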

If you do want to use a regular expression, then:

SELECT REGEXP_REPLACE(
         col_large_string,
         '(.{100,}?) ',
         '\1' || CHR (10)
       ) AS split_to_lines
FROM   tab_large_string
WHERE  string_id = 1;
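
The pattern works because {100,}? is a non-greedy quantifier: it takes at least 100 characters and then as few more as possible before the next space. A minimal sketch with a made-up literal and a width of 10, keeping the matched text with the \1 backreference:

```sql
-- Illustrative only: hypothetical input and a minimum width of 10.
-- '(.{10,}?) ' captures at least 10 characters, extended lazily up to
-- the next space; the space itself is replaced by a newline.
-- The result should be:
--   'the quick brown' || CHR(10) || 'fox jumps over' || CHR(10) || 'the lazy dog'
SELECT REGEXP_REPLACE(
         'the quick brown fox jumps over the lazy dog',
         '(.{10,}?) ',
         '\1' || CHR(10)
       ) AS split_to_lines
FROM   DUAL;
```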
