1 step regexp_replace instead of two steps, part II: remove spaces on either side of multi-word values between comma delimiters
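This is a follow-up to "1 step regexp_replace instead of two steps", because I did not provide enough sample data there. @hatless gave me a solution for removing the spaces between commas, and @lemon suggested that I provide more data.

Spaces should be removed from either side of the words between the delimiters: " new york " should become "new york". There may be spaces on either side of a word, and those should be removed, but not the spaces between words. The comma-delimited string can have any number of delimiters, up to 8 commas.

I can do multiple nested replaces of " ," and ", ", which works in most cases, except when there are multiple spaces before or after a comma. Can this be done with a single regexp_replace, or are several needed?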

**"RESULT BEFORE"** |
---|
"university of washington, seattle, washington" |
"university of washington, seattle , washington" |
"university of washington, , washington" |
"university of washington, seattle, washington" |
"university of new york,ny , usa" |
"university of new york,new york , usa" |
"university of new york, new york , usa" |

with t1 as
(
select 1 id,"university of washington, seattle, washington" location
union all select 2 id,"university of washington, seattle , washington"
union all select 3 id,"university of washington, , washington"
union all select 4 id,"university of washington, seattle , washington"
union all select 5 id,"university of new york,ny , usa"
union all select 6 id,"university of new york,new york , usa"
union all select 7 id,"university of new york, new york , usa"
)
select id,REGEXP_REPLACE(lower(location),r'([^,]+,)[, ]+', r'\1') location
from t1
order by 1;

**"DESIRED RESULT"** |
---|
"university of washington,seattle,washington" |
"university of washington,seattle,washington" |
"university of washington,washington" |
"university of washington,seattle,washington" |
"university of new york,ny,usa" |
"university of new york,new york,usa" |
"university of new york,new york,usa" |

Actual result

id | **location** |
---|---|
1 | university of washington,seattle,washington |
2 | university of washington,seattle ,washington |
3 | university of washington,washington |
4 | university of washington,seattle ,washington |
5 | university of new york,ny ,usa |
6 | university of new york,new york ,usa |
7 | university of new york,new york ,usa |

Try this version:
SELECT id, REGEXP_REPLACE(LOWER(location),
r'([^,]+)\s*,\s*([^,]*?)\s*,\s*(.*?)\s*',
r'\1,\2,\3') AS location
FROM t1
ORDER BY 1;
Here is a working demo showing that the regex replacement logic works.

Consider the following:
select id,
regexp_replace(trim(location), r'\s*,\s*', ',') as location,
from t1
order by 1
If applied to the sample data in your question, the output is (output not shown)
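
One case worth checking separately is the empty field in row 3 (`", , "`). With `r'\s*,\s*'` each comma is replaced on its own, so that row would come out with `,,` rather than the single comma shown in the desired result. Below is a minimal sketch of one way to cover it, assuming empty fields should simply be dropped: let the pattern consume a whole run of commas and their surrounding whitespace in one match. The `(?:,\s*)+` group and the added `lower()` are assumptions for illustration, not part of either answer above.

```sql
-- Sketch only: same sample data as in the question.
with t1 as
(
  select 1 id, "university of washington, seattle, washington" location
  union all select 2 id, "university of washington, seattle , washington"
  union all select 3 id, "university of washington, , washington"
  union all select 4 id, "university of washington, seattle , washington"
  union all select 5 id, "university of new york,ny , usa"
  union all select 6 id, "university of new york,new york , usa"
  union all select 7 id, "university of new york, new york , usa"
)
select id,
  regexp_replace(
    trim(lower(location)),   -- strip leading/trailing whitespace of the whole string
    r'\s*(?:,\s*)+',         -- optional spaces, then one or more "comma + optional spaces"
    ','                      -- collapse the whole run into a single comma
  ) as location
from t1
order by id
-- Row 3 becomes "university of washington,washington";
-- spaces inside a value, such as "new york", are not followed by a comma and stay untouched.
```

This does not remove a leading or trailing comma, since `trim` only strips whitespace; if such rows can occur, they would need an extra step.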