Optimizing a complex MySQL query to reduce query time
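I have a very complex query that I have already optimized a lot, but I cannot find a better way to write it and bring the query time down further. Let me share the details so you can understand it better.
Here is my query.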
SELECT a.msisdn, GROUP_CONCAT( COALESCE(a.answer, 'skip') order by a.question_id separator ',') as answer,
(SELECT GROUP_CONCAT( COALESCE(answer, 'skip') order by question_id separator ',') as answer FROM `campaign_survey_responses`
WHERE campaign_id = 11559 and question_id=14751 and msisdn=a.msisdn) as a1,
(SELECT GROUP_CONCAT( COALESCE(s.answer, 'skip') order by s.question_id separator ',') as answer
FROM `campaign_survey_responses` s left join campaign_survey_questions q on q.id = s.question_id
WHERE s.campaign_id = 11559 and q.parent_id=5128 and q.sort_order = 0 and s.msisdn=a.msisdn) as sur1,
(SELECT GROUP_CONCAT( COALESCE(answer, 'skip') order by question_id separator ',') as answer FROM `campaign_survey_responses`
WHERE campaign_id = 11559 and question_id=14768 and msisdn=a.msisdn) as a2,
(SELECT GROUP_CONCAT( COALESCE(s.answer, 'skip') order by s.question_id separator ',') as answer
FROM `campaign_survey_responses` s left join campaign_survey_questions q on q.id = s.question_id
WHERE s.campaign_id = 11559 and q.parent_id=5108 and q.sort_order = 0 and s.msisdn=a.msisdn) as sur2,
(SELECT GROUP_CONCAT( COALESCE(answer, 'skip') order by question_id separator ',') as answer FROM `campaign_survey_responses`
WHERE campaign_id = 11559 and question_id=14785 and msisdn=a.msisdn) as a3,
(SELECT GROUP_CONCAT( COALESCE(s.answer, 'skip') order by s.question_id separator ',') as answer
FROM `campaign_survey_responses` s left join campaign_survey_questions q on q.id = s.question_id
WHERE s.campaign_id = 11559 and q.parent_id=5148 and q.sort_order = 0 and s.msisdn=a.msisdn) as sur3
FROM `campaign_survey_responses` a
WHERE a.campaign_id = 11559 and a.question_id=14750 group by msisdn limit 500;
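The query above condenses about 1 million rows into only about 50,000 rows, and its sole purpose right now is a CSV export. But it takes far too long to run.
I applied a limit of 500 rows to the query, and even with that limit it takes about 78.844856977463 seconds. What about 50K rows? It would bring the server down.
Is there a better way to optimize this query? I am using MySQL 5.7. Thanks.
It returns data like the following. With the limit of 500 there are 500 records; only 2 records are shown below as an example.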
0 =>
array (size=8)
'msisdn' => string '3003932957' (length=10)
'answer' => string '1' (length=1)
'answer1' => string '4' (length=1)
'survey1' => null
'answer2' => null
'survey2' => null
'answer3' => null
'survey3' => null
1 =>
array (size=8)
'msisdn' => string '3013555354' (length=10)
'answer' => string '1' (length=1)
'answer1' => string '3' (length=1)
'survey1' => string '2,1,1' (length=5)
'answer2' => null
'survey2' => null
'answer3' => null
'survey3' => null
2 =>
Table structure:
CREATE TABLE `campaign_survey_responses` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`campaign_id` int unsigned NOT NULL,
`question_id` int unsigned NOT NULL,
`answer` varchar(20) DEFAULT NULL,
`msisdn` int unsigned NOT NULL,
`campaign_date` date DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `campaign_id` (`campaign_id`),
KEY `question_id` (`question_id`)
) ENGINE=MyISAM AUTO_INCREMENT=9016122 DEFAULT CHARSET=utf8
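I have added the table structure. Basically these are system-generated calls: based on user input, responses such as 1, 2, 3 are added to the table above. It is a kind of multi-level survey. There is a parent question with 3 options; if the user chooses option 2, the system plays the question related to option 2. After that question, if the user chooses any option, a series of 10 questions corresponding to option 2 follows.
The same applies to options 1 and 3.
Parent question (it has 3 sub-questions)
Each of the 3 sub-questions has 10 questions.
If the user chooses sub-question 1, the system plays the 10 questions related to sub-question 1 and does not play the questions related to options 2 and 3. It is up to the user which question to play.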
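The correlated subqueries in the SELECT list are what is really slowing this down, and you have a lot of them. If they are unavoidable, I would apply them to a subset of the result rather than to the whole result. That is, I would first build an inline view: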
SELECT
-- your correlated subqueries as fields here, correlated on T1.msisdn
FROM
( -- start inline view
SELECT -- your columns
FROM `campaign_survey_responses` a
WHERE a.campaign_id = 11559 and a.question_id=14750
GROUP BY msisdn
LIMIT 500 -- no semicolon here: the statement continues after the closing parenthesis
) T1
That way the correlated subqueries only touch 500 records.
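For reference, the inline-view idea could be written out roughly as follows. This is only a sketch that assumes the table and column names from the question; the aliases `r` and `T1` are illustrative, only the first pair of subqueries is shown, and it has not been timed against the real data.

```sql
-- Sketch: run the grouped, limited scan once in a derived table (T1),
-- then evaluate the correlated subqueries only for the rows T1 returns.
SELECT
    T1.msisdn,
    T1.answer,
    (SELECT GROUP_CONCAT(COALESCE(r.answer, 'skip') ORDER BY r.question_id SEPARATOR ',')
       FROM campaign_survey_responses r
      WHERE r.campaign_id = 11559
        AND r.question_id = 14751
        AND r.msisdn = T1.msisdn) AS a1,
    (SELECT GROUP_CONCAT(COALESCE(s.answer, 'skip') ORDER BY s.question_id SEPARATOR ',')
       FROM campaign_survey_responses s
       -- The WHERE clause filters on q, so the original LEFT JOIN
       -- already behaves like an inner join.
       JOIN campaign_survey_questions q ON q.id = s.question_id
      WHERE s.campaign_id = 11559
        AND q.parent_id = 5128
        AND q.sort_order = 0
        AND s.msisdn = T1.msisdn) AS sur1
    -- a2/sur2 and a3/sur3 would follow the same pattern
FROM (
    SELECT a.msisdn,
           GROUP_CONCAT(COALESCE(a.answer, 'skip') ORDER BY a.question_id SEPARATOR ',') AS answer
      FROM campaign_survey_responses a
     WHERE a.campaign_id = 11559
       AND a.question_id = 14750
     GROUP BY a.msisdn
     LIMIT 500
) AS T1;
```

Comparing the EXPLAIN output of this form with that of the original query should show whether the subqueries are now driven from the small derived table.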
These composite indexes on campaign_survey_responses will help a lot:
INDEX(campaign_id, question_id, msisdn, answer)
INDEX(campaign_id, msisdn, question_id, answer)
(The first will probably help a; the second will probably help s.)
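As a sketch, the two indexes could be added with a single ALTER TABLE. The index names below are hypothetical, and on a MyISAM table of this size the statement may take a while and will block writes while it runs, so it is worth trying on a copy first.

```sql
-- Hypothetical index names; the column order follows the suggestion above.
ALTER TABLE campaign_survey_responses
    ADD INDEX idx_campaign_question_msisdn (campaign_id, question_id, msisdn, answer),
    ADD INDEX idx_campaign_msisdn_question (campaign_id, msisdn, question_id, answer);

-- Check whether a lookup shaped like one of the subqueries can use the new index.
-- "Using index" in the Extra column means it is answered from the index alone.
EXPLAIN
SELECT msisdn, answer
  FROM campaign_survey_responses
 WHERE campaign_id = 11559
   AND question_id = 14751
   AND msisdn = 3003932957;
```

Because both indexes start with campaign_id, the existing single-column `campaign_id` key becomes redundant and could probably be dropped once the new ones are in place.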