MySQL big query optimization
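I need to optimize the query below, which takes up to 10 minutes to run.
Running EXPLAIN on it shows that it scans all 350815 rows of the "table_3" table and 1 row for each of the other tables.
Are there general rules for placing indexes correctly? Should I consider multi-column (composite) indexes? Where should I apply them first: on the JOINs, the WHERE, or the GROUP BY? If I remember correctly, there is a hierarchy to follow. Also, if EXPLAIN already shows 1 in the rows column for every table but one, how can I optimize further? Usually my optimization ends once every table but one shows just 1 row.
All tables average from 100k to 1000k+ rows.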
CREATE TABLE datab1.sku_performance
SELECT
table1.sku,
CONCAT(table1.sku,' ',table1.fk_container ) as sku_container,
table1.price as price,
SUM( CASE WHEN ( table1.fk_table1_status = 82
OR table1.fk_table1_status = 119
OR table1.fk_table1_status = 124
OR table1.fk_table1_status = 141
OR table1.fk_table1_status = 131) THEN 1 ELSE 0 END)
/ COUNT( DISTINCT id_catalog_school_class) as qty_returned,
SUM( CASE WHEN ( table1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT id_catalog_school_class) as qt,
container.id_container as container_id,
container.idden as container_idden,
container.delivery_badge,
catalog_school.id_catalog_school,
LEFT(catalog_school.flight_fair,2) as departing_country,
catalog_school.weight,
catalog_school.flight_type,
catalog_school.price,
table_3.id_table_3,
table_3.fk_catalog_brand,
MAX( LEFT( table_3.note,3 )) AS supplier,
GROUP_CONCAT( product_number, ' by ',FORMAT(catalog_school_class.quantity,0)
ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod,
Sum( distinct( catalog_school_class.purch_pri * catalog_school_class.quantity)) AS final_purch_pri,
catalog_groupp.idden as supplier_idden,
catalog_category_details.id_catalog_category,
catalog_category_details.cat1 as product_cat1,
catalog_category_details.cat2 as product_cat2,
COUNT( distinct catalog_school_class.id_catalog_school_class) as setinfo,
datab1.pageviewgrouped.pv as page_views,
Sum(distinct(catalog_school_class.purch_pri * catalog_school_class.quantity)) AS purch_pri,
container_has_table_3.position,
max( table1.created_at ) as last_order_date
FROM
table1
LEFT JOIN container
ON table1.fk_container = container.id_container
LEFT JOIN catalog_school
ON table1.sku = catalog_school.sku
LEFT JOIN table_3
ON catalog_school.fk_table_3 = table_3.id_table_3
LEFT JOIN container_has_table_3
ON table_3.id_table_3 = container_has_table_3.fk_table_3
LEFT JOIN datab1.pageviewgrouped
on table_3.id_table_3 = datab1.pageviewgrouped.url
-- table_3_has_catalog_minority must be joined before catalog_category_details,
-- because the ON clause of the latter references it
LEFT JOIN table_3_has_catalog_minority
ON table_3.id_table_3 = table_3_has_catalog_minority.fk_table_3
LEFT JOIN datab1.catalog_category_details
ON datab1.catalog_category_details.id_catalog_category = table_3_has_catalog_minority.fk_catalog_category
LEFT JOIN catalog_groupp
ON table_3.fk_catalog_groupp = catalog_groupp.id_catalog_groupp
LEFT JOIN catalog_school_class
ON catalog_school.id_catalog_school = catalog_school_class.fk_catalog_school
WHERE
table_3.status_ok = 1
AND catalog_school.status = 'active'
AND table_3_has_catalog_minority.is_primary = '1'
GROUP BY
table1.sku,
table1.fk_container;
Rows per table:
.table1 960096 to 1.3mn rows
.container 9275 to 13000 rows
.catalog_school 709970 to 1 mn rows
.table_3 709970 to 1 mn rows
.container_has_table_3 709970 to 1 mn rows
.pageviewgrouped 500000 rows
.catalog_school_class 709970 to 1 mn rows
.catalog_groupp 3000 rows
.table_3_has_catalog_minority 709970 to 1 mn rows
.catalog_category_details 659 rows
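This is too much to fit in a single comment, so I'll add it here and adjust later as needed. You have LEFT JOINs everywhere, but your WHERE clause filters on specific fields from Table_3, Catalog_School and Table_3_has_catalog_minority. That effectively turns those joins into INNER JOINs, because rows with no match carry NULL in those columns and fail the WHERE test.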
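To make the LEFT JOIN point concrete, here is a minimal sketch using two of the tables from the question. The aliases `cs` and `t3` are only for illustration. Whether true outer-join behaviour is wanted (form 3) is an assumption that only the asker can confirm.

```sql
-- Form 1: LEFT JOIN followed by a WHERE filter on the joined table.
-- catalog_school rows without a matching table_3 row get NULL for
-- t3.status_ok, fail the WHERE test, and are discarded.
SELECT cs.sku, t3.id_table_3
FROM catalog_school cs
LEFT JOIN table_3 t3
       ON cs.fk_table_3 = t3.id_table_3
WHERE t3.status_ok = 1;

-- Form 2: returns the same rows as form 1, written as the inner join
-- it effectively is.
SELECT cs.sku, t3.id_table_3
FROM catalog_school cs
JOIN table_3 t3
  ON cs.fk_table_3 = t3.id_table_3
WHERE t3.status_ok = 1;

-- Form 3: a different result. Moving the filter into the ON clause keeps
-- every catalog_school row and leaves id_table_3 NULL where no table_3
-- row with status_ok = 1 matches.
SELECT cs.sku, t3.id_table_3
FROM catalog_school cs
LEFT JOIN table_3 t3
       ON cs.fk_table_3 = t3.id_table_3
      AND t3.status_ok = 1;
```

Forms 1 and 2 return identical rows, so writing the join as a plain JOIN states the intent more clearly and leaves the optimizer free to reorder the tables.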
As for your WHERE clause:
WHERE
table_3.status_ok = 1
AND catalog_school.status = 'active'
AND table_3_has_catalog_minority.is_primary = '1'
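Which of these criteria returns the smallest result set, by table / column? For example, Table_3.Status_ok = 1 might match 500k records, while table_3_has_catalog_minority.is_primary might match only 65k and catalog_school.status = 'active' might match 430k.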
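One way to answer that question is to count the rows each filter matches on its own. This is only a sketch: the column and table names come from the question, and the result aliases are made up for readability.

```sql
-- How many rows does each WHERE condition keep?
-- The filter with the smallest count is usually the best candidate
-- for the table to read first.
SELECT COUNT(*) AS table_3_ok_rows
FROM table_3
WHERE status_ok = 1;

SELECT COUNT(*) AS catalog_school_active_rows
FROM catalog_school
WHERE status = 'active';

SELECT COUNT(*) AS minority_primary_rows
FROM table_3_has_catalog_minority
WHERE is_primary = '1';
```

Each of these may itself do a full scan if the filtered column is not indexed, so they are best run once while investigating rather than routinely.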
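Also, some of your columns are not qualified with the table they come from. Can you please confirm where they belong? For example, "id_catalog_school_class" and "product_number".
Sometimes changing the order of the tables, with good knowledge of the makeup of the data, and adding MySQL's "STRAIGHT_JOIN" keyword can improve performance. I ran into this in the past while working with a government database of contracts and grants that had 20+ million records joined to 15 or more lookup tables. The query went from hanging the server to finishing in under 2 hours. Considering the amount of data I was dealing with, that was actually a good time.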
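As a small illustration of what STRAIGHT_JOIN changes, the two statements below can be compared with EXPLAIN. They use a cut-down two-table version of the question's query, which is an assumption made for brevity. The actual plans depend on your data and indexes.

```sql
-- Without the hint, the optimizer chooses the join order itself.
EXPLAIN
SELECT cs.sku, t1.fk_container
FROM catalog_school cs
JOIN table1 t1
  ON cs.sku = t1.sku
WHERE cs.status = 'active';

-- With SELECT STRAIGHT_JOIN, MySQL reads the tables in the order they
-- are listed in the FROM clause: catalog_school first, then table1.
EXPLAIN
SELECT STRAIGHT_JOIN cs.sku, t1.fk_container
FROM catalog_school cs
JOIN table1 t1
  ON cs.sku = t1.sku
WHERE cs.status = 'active';
```

If both plans list the tables in the same order, the hint changes nothing for that query. If they differ, comparing the rows and key columns of the two outputs shows which order reads less data.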
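After dissecting this a bit, I restructured the query some more for readability, added aliases for the table references, changed the order of the tables, and came up with some suggested indexes. To help the query, I moved the Catalog_School table to the first position and added STRAIGHT_JOIN. The catalog_school index leads with STATUS to match the WHERE clause, then includes SKU because it is the first element of the GROUP BY, followed by the other columns used to join to the subsequent tables. With these columns in the index, the engine can qualify the joins without going to the raw data.
By changing the GROUP BY to use Catalog_School.SKU instead of table1.SKU, the index on catalog_school can help optimize it. The value is the same, since the tables are joined on catalog_school.sku = table1.sku. I also list indexes for table1 and table_3. These are suggestions, again meant to qualify the joins up front without going to the raw data pages of the tables.
I would be interested to know the final performance on your data (better or worse).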
TABLE INDEX ON...
catalog_school ( status, sku, fk_table_3, id_catalog_school )
table1 ( sku, fk_container )
table_3 ( id_table_3, status_ok, fk_catalog_groupp )
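The suggested indexes could be created roughly as follows. The index names are hypothetical, and the second statement uses `table1`, the table name that appears in the question's query.

```sql
-- Leads with status (WHERE), then sku (GROUP BY), then the join columns,
-- so the lookups on catalog_school can be answered from the index.
CREATE INDEX idx_cs_status_sku
    ON catalog_school (status, sku, fk_table_3, id_catalog_school);

-- Supports the join on sku and the second GROUP BY column.
CREATE INDEX idx_t1_sku_container
    ON table1 (sku, fk_container);

-- Supports the join on id_table_3 plus the status_ok filter and the
-- onward join to catalog_groupp.
CREATE INDEX idx_t3_id_status_groupp
    ON table_3 (id_table_3, status_ok, fk_catalog_groupp);

-- Check what already exists before adding anything.
SHOW INDEX FROM catalog_school;
```

Building indexes on tables of around a million rows takes time and adds write overhead, so it is worth checking with SHOW INDEX first, since an existing primary key or index may already cover part of this.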
SELECT STRAIGHT_JOIN
CS.sku,
CONCAT(CS.sku,' ',T1.fk_container ) as sku_container,
T1.price as price,
SUM( CASE WHEN ( T1.fk_table1_status IN ( 82, 119, 124, 141, 131))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT CSC.id_catalog_school_class) as qty_returned,
SUM( CASE WHEN ( T1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT CSC.id_catalog_school_class) as qt,
CS.id_catalog_school,
LEFT(CS.flight_fair,2) as departing_country,
CS.weight,
CS.flight_type,
CS.price,
T3.id_table_3,
T3.fk_catalog_brand,
MAX( LEFT( T3.note,3 )) AS supplier,
C.id_container as container_id,
C.idden as container_idden,
C.delivery_badge,
GROUP_CONCAT( product_number, ' by ',FORMAT(CSC.quantity,0)
ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod,
Sum( distinct( CSC.purch_pri * CSC.quantity)) AS final_purch_pri,
CGP.idden as supplier_idden,
CCD.id_catalog_category,
CCD.cat1 as product_cat1,
CCD.cat2 as product_cat2,
COUNT( distinct CSC.id_catalog_school_class) as setinfo,
PVG.pv as page_views,
Sum(distinct(CSC.purch_pri * CSC.quantity)) AS purch_pri,
CHT3.position,
max( T1.created_at ) as last_order_date
FROM
catalog_school CS
JOIN table1 T1
ON CS.sku = T1.sku
LEFT JOIN container C
ON T1.fk_container = C.id_container
LEFT JOIN catalog_school_class CSC
ON CS.id_catalog_school = CSC.fk_catalog_school
JOIN table_3 T3
ON CS.fk_table_3 = T3.id_table_3
JOIN table_3_has_catalog_minority T3HCM
ON T3.id_table_3 = T3HCM.fk_table_3
LEFT JOIN datab1.catalog_category_details CCD
ON T3HCM.fk_catalog_category = CCD.id_catalog_category
LEFT JOIN container_has_table_3 CHT3
ON T3.id_table_3 = CHT3.fk_table_3
LEFT JOIN datab1.pageviewgrouped PVG
on T3.id_table_3 = PVG.url
LEFT JOIN catalog_groupp CGP
ON T3.fk_catalog_groupp = CGP.id_catalog_groupp
WHERE
CS.status = 'active'
AND T3.status_ok = 1
AND T3HCM.is_primary = '1'
GROUP BY
CS.sku,
T1.fk_container;