How can I show the differences between 2 columns from different tables using UTL_MATCH in Oracle?
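I want to compare two columns in an Oracle DB and, if the strings differ, show the differences in additional columns. I know I'm missing something.
For example: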
SELECT A.CD_KEY01,
       A.TEXT_01,
       B.TEXT_02,
       UTL_MATCH.edit_distance_similarity(A.TEXT_01, B.TEXT_02) AS distance_similarity
FROM TB_TABLE_01 A
JOIN TB_TABLE_02 B
  ON A.CD_KEY01 = B.CD_KEY02
Sample output I get:
CD_KEY01 | TEXT_01 | TEXT_02 | DISTANCE_SIMILARITY
111 | Superman is good | Superman is good | 100
222 | Superman is bad | Superman is bad | 100
333 | Superman is handsome | Hulk is ugly | 33
444 | Superman is awful | Batman is awful | 90
Sample output I need:
CD_KEY01 | TEXT_01 | TEXT_02 | DISTANCE_SIMILARITY | DIFF_01 | DIFF_02
111 | Superman is good | Superman is good | 100 | NULL | NULL
222 | Superman is bad | Superman is bad | 100 | NULL | NULL
333 | Superman is handsome | Hulk is ugly | 33 | Hulk | ugly
444 | Superman is awful | Batman is awful | 90 | Batman | NULL
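I doubt there is a simple way. One approach is to split the strings into words somehow (recursively, below), compare them word by word, and pivot the result: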
with
  a(key, text_a, word, rn) as (
    select cd_key01, text_01, regexp_substr(text_01, '(\w+)', 1, 1), 1
    from table_01
    union all
    select key, text_a, regexp_substr(text_a, '(\w+)', 1, rn + 1), rn + 1
    from a
    where regexp_substr(text_a, '(\w+)', 1, rn + 1) is not null),
  b(key, text_b, word, rn) as (
    select cd_key02, text_02, regexp_substr(text_02, '(\w+)', 1, 1), 1
    from table_02
    union all
    select key, text_b, regexp_substr(text_b, '(\w+)', 1, rn + 1), rn + 1
    from b
    where regexp_substr(text_b, '(\w+)', 1, rn + 1) is not null)
select *
from (
  select key, rn, text_a, text_b,
         case when a.word <> b.word then b.word end word
  from a full join b using (key, rn))
pivot (max(word) for rn in (1 w1, 2 w2, 3 w3))
order by key
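This query compares the first three words; note that `is` is compared as well. If the strings may have different numbers of words, you have to be careful and modify the `case when` part so that it handles nulls properly.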
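To illustrate the null handling mentioned above, here is a rough sketch of one way the `case when` part might be adapted. It is not the only option. The inline `t1`/`t2` rows are hypothetical stand-ins for the real tables, key 555 is an invented row with different word counts, and the `'(none)'` marker and the fourth pivot column `w4` are my own assumptions. The text columns are left out of the inner select because, after a full join, they would be null on the unmatched side and would split one key across several pivot rows.

```sql
-- Sketch only: t1/t2 are hypothetical inline stand-ins for the two tables.
with
  t1(cd_key01, text_01) as (
    select 444, 'Superman is awful' from dual union all
    select 555, 'Superman is very good' from dual),
  t2(cd_key02, text_02) as (
    select 444, 'Batman is awful' from dual union all
    select 555, 'Superman is good' from dual),
  -- one row per word of TEXT_01, numbered by position
  a(key, text_a, word, rn) as (
    select cd_key01, text_01, regexp_substr(text_01, '\w+', 1, 1), 1
    from t1
    union all
    select key, text_a, regexp_substr(text_a, '\w+', 1, rn + 1), rn + 1
    from a
    where regexp_substr(text_a, '\w+', 1, rn + 1) is not null),
  -- one row per word of TEXT_02, numbered by position
  b(key, text_b, word, rn) as (
    select cd_key02, text_02, regexp_substr(text_02, '\w+', 1, 1), 1
    from t2
    union all
    select key, text_b, regexp_substr(text_b, '\w+', 1, rn + 1), rn + 1
    from b
    where regexp_substr(text_b, '\w+', 1, rn + 1) is not null)
select *
from (
  select key, rn,
         case
           when a.word = b.word then null                    -- same word: no difference
           when a.word is null and b.word is null then null  -- nothing on either side
           else nvl(b.word, '(none)')                        -- differs, or missing in text_02
         end as word
  from a full join b using (key, rn))
pivot (max(word) for rn in (1 w1, 2 w2, 3 w3, 4 w4))
order by key
```

Keep in mind that the comparison is still purely positional: for key 555 the third position compares `very` with `good`, so an inserted word shifts everything after it. If that matters, a real diff algorithm is probably needed rather than this word-by-position approach.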