Need to roughly group similar query executions in Oracle that have slightly different constraints

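I'm evaluating the impact of a planned decommissioning of some current databases. Individually contacting every user who has recently accessed the affected data isn't feasible because of the volume.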

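I'm thinking that if I could perform some form of fuzzy-logic lookup to group queries by user, I could at least identify repeated queries that differ only slightly because of expected constraint changes. Although far from perfect, this could help distinguish queries that are run regularly to support a recurring business function from purely ad hoc queries.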

Can anyone offer some ideas to get me started, or let me know whether there are alternative approaches worth researching given the goal above?

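You can use a combination of UTL_MATCH, DBA_HIST_SQLTEXT, and DBA_HIST_SQLSTAT to find similar query executions. If you are not licensed for AWR, or are only interested in recent queries, you can use GV$SQLSTATS instead of the DBA_HIST tables.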

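Besides being complicated, the query below will need some of its literals adjusted through trial and error. As written, it only looks at the top 10 most-executed queries per user, and for each of those it returns only the top 5 most closely related queries with a similarity score of 60% or higher.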

--Common queries and the top 5 most-closely related queries.
with statements as
(
    --All relevant SQL statements
    select
        sqlstats.parsing_schema_name,
        sqlstats.total_executions,
        sqltext.sql_id,
        --Convert CLOB to VARCHAR2 for UTL_MATCH, keeping only the first 1000 characters.
        --The truncation won't matter, since we're only interested in fuzzy matches anyway.
        to_char(substr(sqltext.sql_text, 1, 1000)) sql_text,
        sqltext.command_type
    from
    (
        --All queries in AWR.
        select sql_id, sql_text, command_type
        from dba_hist_sqltext
    ) sqltext
    join
    (
        --Statistics for all queries in AWR.
        select sql_id, parsing_schema_name, sum(executions_delta) total_executions
        from dba_hist_sqlstat
        group by sql_id, parsing_schema_name
    ) sqlstats
        on sqltext.sql_id = sqlstats.sql_id
    order by parsing_schema_name, total_executions desc
)
--Top N most similar queries.
select *
from
(
    --Ranked similarity.
    select
        similarity.*,
        row_number() over (partition by sql_id1 order by similarity desc) top_similarity
    from
    (
        --Similarity between SQL statements for the Top N SQL and other SQL run by the same user.
        select
            top_n.parsing_schema_name, top_n.sql_id sql_id1, top_n.sql_text sql_text1, top_n.total_executions,
            statements.sql_id sql_id2, statements.sql_text sql_text2,
            utl_match.edit_distance_similarity(top_n.sql_text, statements.sql_text) similarity
        from
        (
            --Top N most executed queries.
            select *
            from
            (
                --Most executed queries per user.
                select
                    statements.*,
                    row_number () over (partition by parsing_schema_name order by total_executions desc) top_n 
                from statements
                order by parsing_schema_name, total_executions desc
            )
            where top_n <= 10
        ) top_n
        join statements
            on top_n.parsing_schema_name = statements.parsing_schema_name
            and top_n.command_type = statements.command_type
            and top_n.sql_id <> statements.sql_id
        order by top_n.sql_id, similarity desc, statements.sql_id
    ) similarity
) ranked_similarity
where top_similarity <= 5
    and similarity >= 60
order by parsing_schema_name, sql_id1, top_similarity;
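
Before tuning the 60 threshold, it may help to see what scores UTL_MATCH gives for statements you already know are "the same query with a different constraint". Here is a minimal sketch; the two statements and the ORDERS table are made up for illustration, so substitute text from your own workload. It also shows JARO_WINKLER_SIMILARITY, another UTL_MATCH function that returns a 0 to 100 score, in case edit distance turns out to be a poor fit for your queries.

```sql
--Compare two hypothetical statements that differ only in a date literal.
--Both functions return an integer from 0 (no match) to 100 (identical).
select
    utl_match.edit_distance_similarity(sql_text1, sql_text2) edit_similarity,
    utl_match.jaro_winkler_similarity(sql_text1, sql_text2) jaro_winkler_similarity
from
(
    --Hypothetical sample statements; replace with real SQL text.
    select
        'select * from orders where order_date > date ''2020-01-01''' sql_text1,
        'select * from orders where order_date > date ''2021-06-15''' sql_text2
    from dual
);
```

If pairs you consider equivalent score well above 60, you can probably raise the cutoff in the main query to cut down on false matches.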