Deduplicate rows case-insensitively (Snowflake)
I would like to remove rows when the same name appears more than once, differing only in letter case.
Original table:
ID | Name |
---|---|
1 | Apple |
2 | Banana |
1 | apple |
2 | APPLE |
3 | BANANA |
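To reproduce the examples, the sample rows can be loaded into a table first. This is only a minimal sketch: the table name DELETE1 is taken from the statements in this post and the rows match the sample data, but the column types are assumptions, since the original DDL is not shown.

```sql
-- Assumed column types; the post does not show the original DDL.
create or replace table DELETE1 (
    ID   integer,
    Name varchar
);

-- Sample rows: the same (ID, Name) pairs used throughout this post.
insert into DELETE1 (ID, Name) values
    (1, 'Apple'),
    (2, 'Banana'),
    (1, 'apple'),
    (2, 'APPLE'),
    (3, 'BANANA');
```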
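Expected output after deduplication (lowercase takes priority when there are multiple spellings):
ID | Name |
---|---|
2 | Banana |
1 | apple |
2 | apple |
3 | Banana |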
ID 1 "Apple" is removed because ID 1 "apple" exists.
ID 2 "APPLE" becomes "apple" because ID 1 "apple" exists.
ID 3 "BANANA" becomes "Banana" because lowercase takes priority.
The statement below only deduplicates within each ID group. As a result, ID 2 "APPLE" stays "APPLE" and ID 3 "BANANA" stays "BANANA", which is not what I want.
create table DELETE2 as select ID, max(Name) as Name
FROM TEST."PUBLIC"."DELETE1"
group by ID, lower(Name);
drop table DELETE1;
alter table DELETE2 rename to DELETE1;
How about:
create table DELETE2 as
select ID, Name
from (
select ID, lower(Name) as Name1, max(Name) as Name
FROM TEST."PUBLIC"."DELETE1"
group by ID, lower(Name)
)
;
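Working SQL that you can paste into Snowflake and run:
The technique: split each word into an array of characters, convert each character to its ASCII code, and sum the codes. Lowercase letters have higher ASCII codes than uppercase letters, so within a case-insensitive group the spelling with the most lowercase letters has the largest sum.
No updates, no functions, just plain old SQL ;-)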
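with cte as (
    -- sample data from the question
    select 1 ID, 'Apple' name
    union select 2 ID, 'Banana' name
    union select 1 ID, 'apple' name
    union select 2 ID, 'APPLE' name
    union select 3 ID, 'BANANA' name
),
lu as (
    -- one row per distinct spelling; g is non-null only for the spelling with
    -- the highest character-code sum within its case-insensitive group
    select
        name,
        lower(name) lu_name,
        sum(ascii(a.value::string)) ac,
        max(ac) over (partition by lower(name)) mac,
        iff(max(ac) over (partition by lower(name)) = sum(ascii(a.value::string)), name, null) g
    from
        cte,
        -- put a comma before every character from position 2 on, then split on commas
        -- (depending on string-literal escaping, the backreference may need to be written '\\0')
        lateral flatten(
            input => split(regexp_replace(name, '.', ',\0', 2), ',')
        ) a
    group by 1, 2
)
-- replace each name with the preferred spelling of its group, then collapse duplicates
select
    cte.id, lu.name
from
    cte
    left outer join lu on lower(cte.name) = lu.lu_name and lu.g is not null
group by 1, 2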
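A shorter alternative sketch, not taken from the answers above. When no collation is specified, Snowflake compares strings by their UTF-8 byte values, so a lowercase letter sorts after its uppercase counterpart. Taking `max(name)` over `lower(name)` alone (rather than over ID and `lower(name)`) and then removing duplicate rows should therefore return the four expected rows for the sample data. The inline `src` CTE is only a stand-in for the real table, and the approach assumes the column carries no collation specification.

```sql
with src as (
    -- stand-in for the real table; same rows as the sample data
    select *
    from (values
        (1, 'Apple'), (2, 'Banana'), (1, 'apple'), (2, 'APPLE'), (3, 'BANANA')
    ) as t (id, name)
)
select distinct
    id,
    -- greatest spelling by byte order across ALL ids sharing the same lower(name)
    max(name) over (partition by lower(name)) as name
from src;
```

Note that this picks the spelling that is greatest in byte order, which is not always the one with the most lowercase letters: given 'aPPLE' and 'Apple', it would keep 'aPPLE', whereas the character-code sum would keep 'Apple'. For the sample data both rules agree.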