SQL(Snowflake) - Removing duplicates from number extraction
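I want to extract the numbers from these strings, but keep only the unique ones. Some strings contain more than 3 unique numbers.
How can I extract all the numbers from a string while removing duplicates? (Most of the numbers are 7-8 characters long.)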
SELECT REGEXP_SUBSTR(REPLACE(string,' '),'[[:digit:]]{8}',1,1)
  ||' '||REGEXP_SUBSTR(REPLACE(string,' '),'[[:digit:]]{8}',1,1)
  ||' '||REGEXP_SUBSTR(REPLACE(string,' '),'[[:digit:]]{8}',1,1) as Num_Value
FROM (select DESCRIPTION as string
from...
)
WHERE Num_Value IS NOT NULL
Strings
- "Contract No(s). 02241899, 02749981, (as..."
- "Contract No(s). 02515351, 02747764, 02707694 (as..."
- "Contract No(s). 02667112, (as..."
My result
- 02241899 02241899 02241899
- 02515351 02515351 02515351
- 02667112 02667112 02667112
What I'm looking for
- 02241899 02749981
- 02515351 02747764 02707694
- 02667112
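The repeated values in the result above follow from the query itself: all three REGEXP_SUBSTR calls pass 1 as the occurrence argument, so each one returns the first match. A minimal sketch that stays with regex is shown here, assuming Snowflake's REGEXP_SUBSTR_ALL, ARRAY_DISTINCT and ARRAY_TO_STRING functions are available in your account. The CTE name src and the alias num_value are illustrative, and the {7,8} quantifier is an assumption based on the 7-8 character lengths mentioned in the question.

```sql
-- Illustrative input; src is a hypothetical stand-in for the real source table.
with src (description) as (
    select 'Contract No(s). 02241899, 02749981, (as...'
)
select
    array_to_string(
        -- collect every run of 7 or 8 digits, then drop repeated entries
        array_distinct(regexp_substr_all(description, '[[:digit:]]{7,8}')),
        ' '
    ) as num_value
from src;
```

Two caveats apply: ARRAY_DISTINCT is not documented as preserving element order, and a run of more than 8 digits would be matched in pieces, so this only fits data where the numbers really are 7 or 8 digits long.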
One approach is to skip regex and instead split the string into a table on a reliable delimiter:
with t1 (id, str) as
(select 1, 'Contract No(s). 02241899, 02749981, (as...')
select distinct
  t1.id,
  t1.str,
  t2.value as contract
-- turn spaces into commas, then split on commas: one row per token
from t1, lateral split_to_table(replace(t1.str,' ',','), ',') t2
-- keep only numeric tokens that are 7 or 8 characters long
where try_cast(t2.value as integer) is not null and
  len(t2.value) in (7,8);
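The query above returns one row per distinct number. To get back to one space-separated string per source row, as in the desired output, the split rows can be re-aggregated. The following is a sketch under assumptions, not a tested solution for the real table: it assumes id identifies each source string, the second sample row is hypothetical (added only to show a repeated number), and contract_nos is an illustrative alias.

```sql
with t1 (id, str) as (
    select 1, 'Contract No(s). 02241899, 02749981, (as...'
    union all
    -- hypothetical row with a repeated number, for illustration only
    select 2, 'Contract No(s). 02667112, 02667112, (as...'
)
select
    t1.id,
    -- DISTINCT drops repeats; the ORDER BY must use the same expression
    listagg(distinct t2.value, ' ') within group (order by t2.value) as contract_nos
from t1, lateral split_to_table(replace(t1.str, ' ', ','), ',') t2
where try_cast(t2.value as integer) is not null
  and len(t2.value) in (7, 8)
group by t1.id;
```

Note that the numbers come out sorted by value rather than in their order of appearance, because LISTAGG with DISTINCT requires the ORDER BY to reference the aggregated expression itself.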