Extract all text except the last word from a column using SQL

Question

Suppose my items table has a column called name:

name
----
Wrench
Hammer (label1)
Screwdriver (label1) (label2)
Tape Measure (label1) (label2) (label3)

I'd like to write a PostgreSQL query that extracts all of the text except the last label (if one exists). So, given the data above, I'd like to end up with:

substring
---------
Wrench
Hammer
Screwdriver (label1)
Tape Measure (label1) (label2)

How can I do this?

Answer 1

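Use substring with a regular expression.

The syntax is:

substring(string, regularExpression)

The regular expression should use () to delimit the part of the string to extract. For example: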

substring('abcef', 'b(..)')

returns 'ce', the two characters after b. If the regular expression does not match the string, it returns NULL.
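
The behavior described above, plus the case of a pattern with no parentheses, can be checked with a quick query. This is only a sketch for experimenting in psql; the column aliases are made up for illustration.

```sql
SELECT substring('abcef' from 'b(..)') AS captured,     -- 'ce': only the parenthesized part is returned
       substring('abcef' from 'b..')   AS whole_match,  -- 'bce': no parentheses, so the whole match is returned
       substring('abcef' from 'x(..)') AS no_match;     -- NULL: the pattern does not match
```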

Specifically, in this case (using the equivalent substring(string from pattern) form):

dmg@[local] # select substring('Hammer (label1)' from '^(.+)\([^\)]+\)$')   ;
 substring 
-----------
 Hammer 
(1 row)

dmg@[local] # select substring('Tape Measure (label1) (label2) (label3)' from '^(.+)\([^\)]+\)$')   ;
            substring            
---------------------------------
 Tape Measure (label1) (label2) 
(1 row)
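
Two things are worth noting about the pattern above: the captured text keeps the space before the removed label, and a name with no label at all (such as Wrench) does not match, so substring returns NULL for it. A possible variation, offered as a sketch rather than the original answer's method, moves the space outside the capture group and falls back to the original name with coalesce. The alias name_without_last_label and the inline VALUES list are assumptions for illustration, and the backslashes assume the default standard_conforming_strings = on.

```sql
SELECT name,
       -- Capture everything before the final " (label)"; if there is no label,
       -- substring returns NULL and coalesce falls back to the unchanged name.
       coalesce(substring(name from '^(.+) \([^)]+\)$'), name) AS name_without_last_label
FROM (VALUES
        ('Wrench'),
        ('Hammer (label1)'),
        ('Screwdriver (label1) (label2)'),
        ('Tape Measure (label1) (label2) (label3)')
     ) AS items(name);
```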

Answer 2

regexp_replace() is a simple way to handle this:

select v.name,
       regexp_replace(name, ' [(][^)]+[)]$', '')
from (values 
       ('Wrench'),
       ('Hammer (label1)'),
       ('Screwdriver (label1) (label2)'),
       ('Tape Measure (label1) (label2) (label3)')
     ) v(name);

Here is a db<>fiddle.
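
If the names live in a real table rather than a VALUES list, the same expression can be applied directly. A minimal sketch, assuming the table is called items as in the question and that each label is preceded by a single space; the alias name_without_last_label is made up for illustration.

```sql
-- Assumes a table named items with a text column called name.
SELECT name,
       regexp_replace(name, ' [(][^)]+[)]$', '') AS name_without_last_label
FROM items;
```

Because the pattern is anchored with $, only a label at the very end of the string is removed, and rows without a label come back unchanged since regexp_replace returns the input as-is when nothing matches.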

sql

postgresql

substring