Optimizing a query with multiple joins
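Honestly, I really thought this would be a simple thing. However, my MySQL query performs very badly and I don't know why.
Three tables: person (10,000,000 rows), employer (50,000 rows), person_achievements (250,000 rows)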
Person (indexes ix_EmpID)
ID int (primary)
Emp_ID int
Full_Name varchar(250)
Employer (indexes ix_ID (ID) and ix_SectID (Sector_ID) and ix_Name)
ID int (primary)
Sector_ID int
Name varchar (250)
person_achievements (indexes ix_ID (ID) and ix_achievement (achievement))
ID int
Sect_ID
Achievement (varchar 250)
Query:
select p.*
from person p
join employer c
on c.ID = p.Emp_ID
join person_achievements a
on a.Sect_ID = c.Sector_ID
where a.Achievement = 'Employee of the Month'
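I thought the indexes I created would make this query perform better. However, they don't. If I remove the join to the person_achievements table and keep a where clause that selects on employer name (which is also an indexed column), it takes just over a second to produce 25,000 rows.
What am I missing?
Edit: added the primary keys for person and employer
Edit 2 (EXPLAIN output):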
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","SIMPLE","a","ref","ix_SectID,ix_Achievement,ix_SectID_Achievement","ix_Achievement","253","const","12654","Using where"
"1","SIMPLE","c","ref","PRIMARY,ix_SectID","ix_SectID","8","db.a.Sect_ID","1",""
"1","SIMPLE","p","ref","Emp_ID","Emp_ID","9","db.c.ID","103","Using where"
This is your query:
select p.*
from person p join
employer c
on c.ID = p.Emp_ID join
person_achievements a
on a.Sect_ID = c.Sector_ID
where a.Achievement = 'Employee of the Month';
The best indexes are: person_achievements(Achievement, Sect_ID), employer(Sector_ID, ID), and person(Emp_ID).
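As a rough sketch, the composite indexes could be created like this. The index names are made up for illustration; the column names follow the schema in the question. The EXPLAIN output lists an ix_SectID_Achievement index which, judging only by its name, has the columns in the opposite order, so it cannot serve the Achievement filter as a leading column.

```sql
-- Sketch only: index names are hypothetical, columns come from the question's schema.

-- Filter column first, then the join column, so both the WHERE condition
-- and the join to employer can be answered from the index.
CREATE INDEX ix_Achievement_SectID
    ON person_achievements (Achievement, Sect_ID);

-- Look up employers by sector and read the employer ID from the same index.
CREATE INDEX ix_SectorID_ID
    ON employer (Sector_ID, ID);

-- person(Emp_ID): the question already lists an index on Emp_ID,
-- so nothing needs to be added there unless it is missing.
```

Running EXPLAIN again afterwards should show whether MySQL actually picks the new indexes.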
You can rearrange the inner joins so that the query reads:
select p.*
from person_achievements a join
employer c
on a.Sect_ID = c.Sector_ID join
person p
on c.ID = p.Emp_ID
where a.Achievement = 'Employee of the Month';
This should be the order in which MySQL chooses to process the query when using the indexes above.
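For inner joins the optimizer is free to reorder the tables, and the EXPLAIN in the question already lists them as a, c, p. If you want to compare a fixed join order against the optimizer's own choice, one option is the STRAIGHT_JOIN modifier, which makes MySQL join the tables in the order they appear in the FROM clause. This is a diagnostic sketch, not something I would leave in production code without measuring:

```sql
-- Force the join order a -> c -> p and inspect the resulting plan.
EXPLAIN
SELECT STRAIGHT_JOIN p.*
FROM person_achievements a
JOIN employer c
    ON a.Sect_ID = c.Sector_ID
JOIN person p
    ON c.ID = p.Emp_ID
WHERE a.Achievement = 'Employee of the Month';
```

If the plan and timing match the version without STRAIGHT_JOIN, the join order is not the problem and the row counts are the next thing to look at.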
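What I don't understand, though, is why a table called person_achievements doesn't have a column for person_id (in this case, Emp_Id). Something seems wrong with the data structure or the naming, and this misunderstanding might be at the heart of the problem. If there is a person ID in that table, then your join might be multiplying the number of rows because it is using the wrong key.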
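To check whether the join really multiplies rows, you can count how many matching achievement rows each sector has. The estimates in the question's EXPLAIN (12654 × 1 × 103) already suggest roughly 1.3 million row combinations, which would explain the slowness. The queries below are a sketch that assumes the intent is "people whose employer's sector has the achievement", which the question does not confirm:

```sql
-- How many 'Employee of the Month' rows exist per sector?
-- Any count above 1 repeats every person in that sector that many times.
SELECT a.Sect_ID, COUNT(*) AS matching_rows
FROM person_achievements a
WHERE a.Achievement = 'Employee of the Month'
GROUP BY a.Sect_ID
HAVING COUNT(*) > 1
ORDER BY matching_rows DESC
LIMIT 10;

-- If each person should appear only once, an EXISTS test avoids the
-- multiplication. Note that this changes the result when duplicates exist.
SELECT p.*
FROM person p
JOIN employer c
    ON c.ID = p.Emp_ID
WHERE EXISTS (
    SELECT 1
    FROM person_achievements a
    WHERE a.Sect_ID = c.Sector_ID
      AND a.Achievement = 'Employee of the Month'
);
```

If the first query returns nothing, the result set is simply large and the time goes into reading the person rows.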