DB query or logic optimization needed
I am running the query below against 2 million records (Oracle 11g). It takes about 2.2 seconds.
My query:
select SUBSCRIBER_NUM, SUBSCRIBER_STATUS, P_ID
from C_S_FORWARD_INFO
where '07052620' LIKE SUBSCRIBER_NUM || '%'
and SCP_VER = 1
Here is my table:
CREATE TABLE C_S_FORWARD_INFO
(
SUBSCRIBER_NUM VARCHAR2(30 BYTE) NOT NULL,
P_ID NUMBER,
SUBSCRIBER_STATUS NUMBER(1,0) DEFAULT 0 NOT NULL,
ACCOUNT_NUMBER INTEGER NOT NULL,
MAJOR_VERSION_ID NUMBER(10,0) DEFAULT 1 NOT NULL,
MINOR_VERSION_ID NUMBER(10,0) DEFAULT 1 NOT NULL,
SCP_VER NUMBER(1,0) DEFAULT 0 CHECK (SCP_VER IN (0,1))
);
ALTER TABLE C_S_FORWARD_INFO ADD CONSTRAINT C_S_FORWARD_INFO_PK
PRIMARY KEY (SUBSCRIBER_NUM,ACCOUNT_NUMBER,MAJOR_VERSION_ID, MINOR_VERSION_ID);
DB records (sample only; the real table has 2 million rows):
Row 1 => 07052620,1,1,10, 1, 1, 1;
Row 2 => 0705262,2,1,10, 1, 1, 1;
Row 3 => 070526,3,1,10, 1, 1, 1;
Row 4 => 070526200001,4, 1,10, 1, 1, 1;
Row 5 => 07052,5,1,10, 1, 1, 1;
......
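Expected result: Row 1 (I get this with the query above. Logic: the longest match from the start of the number, up to '07052620').
How can I optimize the query above, or what other logic would return the expected result within 0.2 seconds? In my query, '07052620' is a dynamic number passed in as input to a stored procedure.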
Update 20/11:
I tried the following (VAR_CALLING_NUM = 07052620):
while var1<=len LOOP
temp1 := SUBSTR(VAR_CALLING_NUM, 1, var1);
temp1 := concat('''',temp1);
temp1 := concat(temp1,'''');
temp6 := temp6 || temp1 || ',' ;
var1:=var1+1;
END LOOP;
temp6 := SUBSTR(temp6, 1,length(temp6)-1);
select SUBSCRIBER_NUM, SUBSCRIBER_STATUS, P_ID from C_S_FORWARD_INFO where SUBSCRIBER_NUM IN ( temp6 ) and SCP_VER = 1 order by length(subscriber_num) desc;
But this returns no rows. It looks like the query does not pick up temp6 dynamically. Please help.
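The following query adds a subquery to your original attempt, so that the result is restricted to rows that match and that are also the longest match. If several rows tie for the longest match it returns an arbitrary one of them, but it can easily be modified to return all of them.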
select SUBSCRIBER_NUM,
SUBSCRIBER_STATUS,
P_ID
from C_S_FORWARD_INFO
where '07052620' LIKE SUBSCRIBER_NUM || '%' and
SCP_VER = 1 and
LENGTH(SUBSCRIBER_NUM) = (select max(length(SUBSCRIBER_NUM)) from C_S_FORWARD_INFO
where '07052620' like SUBSCRIBER_NUM || '%' and SCP_VER = 1)
and ROWNUM = 1;  -- 11g-compatible row limit; on Oracle 12c+ "offset 0 rows fetch next 1 rows only" also works
If this is your query:
select SUBSCRIBER_NUM, SUBSCRIBER_STATUS, P_ID
from C_S_FORWARD_INFO
where '07052620' LIKE SUBSCRIBER_NUM || '%' and
SCP_VER = 1;
then I would suggest rewriting it as:
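select SUBSCRIBER_NUM, SUBSCRIBER_STATUS, P_ID
from (select SUBSCRIBER_NUM, SUBSCRIBER_STATUS, P_ID
      from C_S_FORWARD_INFO
      where SUBSCRIBER_NUM IN ('0', '07', '070', '0705', '07052', '070526', '0705262', '07052620') and
            SCP_VER = 1
      order by length(SUBSCRIBER_NUM) desc)
where ROWNUM = 1;  -- on Oracle 12c+ the outer query can be replaced by "fetch first 1 row only"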
Then add an index on (SCP_VER, SUBSCRIBER_NUM):
create index idx_forwardinfo_2 on c_s_forward_info(SCP_VER, SUBSCRIBER_NUM);
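The hard-coded IN list cannot simply be swapped for one comma-separated string variable. In static SQL inside PL/SQL, a variable such as temp6 is bound as a single value, so IN (temp6) compares SUBSCRIBER_NUM against one long string and matches nothing. Below is a minimal sketch of one way around that. It assumes a hypothetical procedure name (get_longest_match) and made-up parameter names, and it uses the built-in collection type SYS.ODCIVARCHAR2LIST to hold the prefixes:

```sql
-- Hypothetical procedure; its name and parameters are illustrative only.
CREATE OR REPLACE PROCEDURE get_longest_match (
    p_calling_num IN  VARCHAR2,
    p_sub_num     OUT c_s_forward_info.subscriber_num%TYPE,
    p_sub_status  OUT c_s_forward_info.subscriber_status%TYPE,
    p_p_id        OUT c_s_forward_info.p_id%TYPE
) AS
    -- One element per prefix: '0', '07', ..., up to the full number.
    l_prefixes SYS.ODCIVARCHAR2LIST := SYS.ODCIVARCHAR2LIST();
BEGIN
    FOR i IN 1 .. NVL(LENGTH(p_calling_num), 0) LOOP
        l_prefixes.EXTEND;
        l_prefixes(l_prefixes.COUNT) := SUBSTR(p_calling_num, 1, i);
    END LOOP;

    -- Equality match against each prefix, longest match first.
    -- ROWNUM is applied outside the ordered subquery, which works on 11g.
    SELECT subscriber_num, subscriber_status, p_id
      INTO p_sub_num, p_sub_status, p_p_id
      FROM (SELECT fi.subscriber_num, fi.subscriber_status, fi.p_id
              FROM c_s_forward_info fi
             WHERE fi.subscriber_num IN (SELECT column_value FROM TABLE(l_prefixes))
               AND fi.scp_ver = 1
             ORDER BY LENGTH(fi.subscriber_num) DESC)
     WHERE ROWNUM = 1;
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        -- No stored number is a prefix of the input.
        p_sub_num    := NULL;
        p_sub_status := NULL;
        p_p_id       := NULL;
END get_longest_match;
/
```

Because the prefixes are compared with equality rather than LIKE, the optimizer has the option of using an index on SUBSCRIBER_NUM. Whether it actually does so should be confirmed with the execution plan on the real data.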
This may or may not help with your performance problem:
WITH pred_vals AS (SELECT SUBSTR('07052620', 1, LENGTH('07052620') + 1 - LEVEL) str,
LEVEL priority
FROM dual
CONNECT BY LEVEL <= LENGTH('07052620')),
main_join AS (SELECT fi.subscriber_num,
fi.subscriber_status,
fi.p_id,
row_number() OVER (ORDER BY pv.priority) rn
FROM c_s_forward_info fi
INNER JOIN pred_vals pv ON (fi.subscriber_num = pv.str)
WHERE scp_ver = 1)
SELECT subscriber_num,
subscriber_status,
p_id
FROM main_join
WHERE rn = 1;
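I would recommend a multi-column index on c_s_forward_info covering these columns: (subscriber_num, scp_ver, subscriber_status, p_id).
This should hopefully allow the query to run against the index alone.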
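As a sketch, the index described above could be created as follows. The index name idx_forwardinfo_cover is made up for illustration, and whether the optimizer actually chooses an index-only plan should be checked against the real data:

```sql
-- Hypothetical index name. Every column the query reads is in the index,
-- so Oracle may be able to answer the query without visiting the table.
CREATE INDEX idx_forwardinfo_cover
    ON c_s_forward_info (subscriber_num, scp_ver, subscriber_status, p_id);
```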
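The query works by first breaking the string you pass in into the various combinations you want to match against (these are the first N characters of the string, where N is between 1 and the length of the passed-in string).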
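To see that first step in isolation, the pred_vals subquery can be run on its own. This is only an illustration, with the literal '07052620' standing in for the procedure's input and an ORDER BY added so the listing is deterministic:

```sql
SELECT SUBSTR('07052620', 1, LENGTH('07052620') + 1 - LEVEL) AS str,
       LEVEL AS priority
  FROM dual
CONNECT BY LEVEL <= LENGTH('07052620')
 ORDER BY priority;

-- STR       PRIORITY
-- --------  --------
-- 07052620         1
-- 0705262          2
-- 070526           3
-- 07052            4
-- 0705             5
-- 070              6
-- 07               7
-- 0                8
```

The longest prefix gets priority 1, so ordering by priority ascending puts the longest match first.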
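Once we have these strings to match, we can join them directly to the subscriber column, which lets the optimizer use an index if it judges that to be more efficient.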
Then we can compute row_number() (or dense_rank(), if you want to show every row that ties for the highest-priority match) and select the top row.
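For reference, here is a sketch of the dense_rank() variant, which keeps every row that ties for the longest match instead of a single row. Ties are possible because the primary key also includes ACCOUNT_NUMBER and the version columns, so the same SUBSCRIBER_NUM can appear more than once. The bind variable :calling_num is an assumed name standing in for the literal:

```sql
WITH pred_vals AS (SELECT SUBSTR(:calling_num, 1, LENGTH(:calling_num) + 1 - LEVEL) str,
                          LEVEL priority
                   FROM   dual
                   CONNECT BY LEVEL <= LENGTH(:calling_num)),
     main_join AS (SELECT fi.subscriber_num,
                          fi.subscriber_status,
                          fi.p_id,
                          -- rows that match the same prefix share a rank
                          dense_rank() OVER (ORDER BY pv.priority) rk
                   FROM   c_s_forward_info fi
                          INNER JOIN pred_vals pv ON (fi.subscriber_num = pv.str)
                   WHERE  fi.scp_ver = 1)
SELECT subscriber_num,
       subscriber_status,
       p_id
FROM   main_join
WHERE  rk = 1;
```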