1 step regexp_replace instead of two steps, part II: replace spaces on either side of multiple words between comma delimiters

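This is a follow-up to "1 step regexp_replace instead of two steps", because I did not provide enough sample data there. @hatless gave me a solution for removing the spaces between the commas, and @lemon suggested that I provide more data.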

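Spaces should be removed from either side of the words between delimiters. For example, " new york " should become "new york": spaces on either side of the words should be removed, but not the spaces between the words. The comma-delimited string can have any number of delimiters, up to 8 commas.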

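I can do multiple nested replaces of " ," and ", ", which works in most cases, except when there is more than one space before or after a comma. Can this be done with a single regexp_replace, or are several needed?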
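To make the limitation concrete, here is a minimal sketch of the nested-replace approach. The input string is a made-up example, not one of my real rows.

```sql
-- Illustrative only: the literal below is a made-up example string.
-- Each REPLACE removes just one space next to a comma.
select replace(replace('seattle  , washington', ' ,', ','), ', ', ',') as location;
-- returns 'seattle ,washington': one of the two spaces before the comma survives
```

Adding more nested REPLACE calls only helps up to a fixed number of spaces, which is why I am asking about a regular expression.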

**"RESULT BEFORE"**
"university of washington, seattle, washington"
"university of washington, seattle , washington"
"university of washington, , washington"
"university of washington, seattle, washington"
"university of new york,ny , usa"
"university of new york,new york , usa"
"university of new york, new york , usa"

with t1 as
(
select           1 id,"university of washington,           seattle, washington" location
union all select 2 id,"university of washington, seattle  , washington"
union all select 3 id,"university of washington,      , washington"
union all select 4 id,"university of washington, seattle            , washington"
union all select 5 id,"university of new york,ny  , usa"
union all select 6 id,"university of new york,new york  , usa"
union all select 7 id,"university of new york, new york  , usa"
)
select id,REGEXP_REPLACE(lower(location),r'([^,]+,)[, ]+', r'\1') location
from t1
order by 1;

**"DESIRED RESULT"**
"university of washington,seattle,washington"
"university of washington,seattle,washington"
"university of washington,washington"
"university of washington,seattle,washington"
"university of new york,ny,usa"
"university of new york,new york,usa"
"university of new york,new york,usa"

Actual result

| id | location |
|----|----------|
| 1 | university of washington,seattle,washington |
| 2 | university of washington,seattle ,washington |
| 3 | university of washington,washington |
| 4 | university of washington,seattle ,washington |
| 5 | university of new york,ny ,usa |
| 6 | university of new york,new york ,usa |
| 7 | university of new york,new york ,usa |

Try this version:

SELECT id, REGEXP_REPLACE(LOWER(location),
                          r'([^,]+)\s*,\s*([^,]*?)\s*,\s*(.*?)\s*',
                          r'\1,\2,\3') AS location
FROM t1
ORDER BY 1;

Here is a working demo showing that the regex replacement logic works.
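If empty fields should also disappear, as row 3 of the desired result ("university of washington,washington") suggests, one possible sketch is a single pattern that treats a whole run of commas and surrounding whitespace as one delimiter. The three sample rows below are copied from the question. The use of TRIM and the non-capturing group are my own assumptions, not part of the version above.

```sql
-- Sketch only: three rows copied from the question's sample data.
with t1 as (
  select 2 id, 'university of washington, seattle  , washington' location
  union all select 3, 'university of washington,      , washington'
  union all select 7, 'university of new york, new york  , usa'
)
select id,
  -- \s*(?:,\s*)+ matches one or more commas together with any whitespace
  -- around them, so the whole run is replaced by a single comma.
  -- TRIM (an assumption) also strips spaces at the very start and end.
  regexp_replace(trim(lower(location)), r'\s*(?:,\s*)+', ',') as location
from t1
order by id;
-- 2  university of washington,seattle,washington
-- 3  university of washington,washington
-- 7  university of new york,new york,usa
```

Because the pattern always requires at least one comma, spaces between words such as "new york" are left untouched, and the number of fields does not matter.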

Consider the following:

select id,
  regexp_replace(trim(location), r'\s*,\s*', ',') as location,
from t1
order by 1            

If applied to the sample data in your question, the output is