MySQL (RDS) queries are still running in my process list long after my scripts have finished
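First off, I know there is a fundamental problem with the way I am running these queries. I want to set aside the root cause here and ask only the following:
I have a PHP script running on Lambda (Laravel Vapor). The Lambda function times out after 60 seconds while running a batch of INSERT, UPDATE, and DELETE queries, but in the MySQL process list I can see those queries still running long afterwards (tens of minutes).
If the PHP script has timed out, where are these queries coming from? My guess is that they are stored in the QUERY CACHE, but that is only a guess. I would love to know more about how this works.
The problem I have right now is that the PHP script times out, so my application thinks it failed. In a later function we run the same queries again, but of course they are still running on the RDS instance. So we end up stacking the same queries on top of each other; PHP has no idea the SQL server has any queries running...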
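My guess is that the connection to the MySQL server is not closed properly because of the PHP timeout, so the database server does not know that it should interrupt the running queries.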
SHOW PROCESSLIST does not show the query cache. The query cache doesn't store queries; it stores query results, i.e., data.
What you see in SHOW PROCESSLIST are running queries. When a client connects, the MySQL server creates a thread for that connection, and that thread executes each query the connection sends. When the client closes the connection, the thread in the server is terminated.
If your Lambda script has terminated abnormally, for example because it hit the AWS time limit, the MySQL server may not realize immediately that the client has gone away, so the server allows the query to keep running until it's finished. But there will be no client to fetch the results. Eventually the server will time out that thread and terminate it.
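To make the clean-up of such orphaned queries concrete, here is a minimal sketch in Python. It assumes you have already fetched the rows of SHOW FULL PROCESSLIST as dictionaries keyed by the standard column names (Id, User, Command, Time, Info), and it only builds the statements; it does not connect to anything. The function name, the 60-second threshold, and the sample user are hypothetical. It emits calls to the RDS stored procedure mysql.rds_kill_query, on the assumption that the account may not be allowed to KILL other users' threads directly.

```python
from typing import Iterable, List, Mapping, Optional


def orphan_kill_statements(
    rows: Iterable[Mapping[str, object]],
    min_seconds: int = 60,
    user: Optional[str] = None,
) -> List[str]:
    """Build one CALL statement per thread that has been executing a
    query for at least `min_seconds`.

    `rows` are SHOW FULL PROCESSLIST rows as dicts. Idle connections
    (Command = 'Sleep') and rows without a statement are skipped.
    """
    statements = []
    for row in rows:
        if row.get("Command") != "Query":
            continue  # not currently executing a statement
        if row.get("Info") is None:
            continue  # no statement text, nothing to cancel
        if user is not None and row.get("User") != user:
            continue  # only touch the application's own threads
        if int(row.get("Time") or 0) < min_seconds:
            continue  # still within the expected run time
        # Cancels the statement but leaves the connection open.
        statements.append(f"CALL mysql.rds_kill_query({int(row['Id'])})")
    return statements
```

Review the generated statements before running them: a long Time value alone does not prove that the client is gone, and cancelling a statement rolls back whatever that statement had done so far.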
If you use Lambdas, you have to take special care to make sure your queries are well-optimized and won't run for longer than the Lambda execution time limit. I remembered that limit as 5 minutes; the current maximum is 15 minutes, and your function is evidently configured for 60 seconds.
If you have a query that runs longer than that limit, it's not a good candidate for running in a Lambda.
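One way to respect the time limit for a batch of statements is to stop issuing new ones before the function is about to be killed. The following is only a sketch under assumptions: `execute` and `remaining_ms` are hypothetical callables that you would supply (in a Python Lambda the handler's context object offers get_remaining_time_in_millis(); a PHP runtime would need its own equivalent), and the 10-second safety margin is arbitrary.

```python
from typing import Callable, List, Sequence


def run_within_budget(
    statements: Sequence[str],
    execute: Callable[[str], object],
    remaining_ms: Callable[[], int],
    safety_ms: int = 10_000,
) -> List[str]:
    """Run statements in order, but stop as soon as less than
    `safety_ms` of execution time is left.

    Returns the statements that were NOT run, so the caller can hand
    them to a later invocation instead of blindly re-running the
    whole batch.
    """
    for index, statement in enumerate(statements):
        if remaining_ms() < safety_ms:
            return list(statements[index:])
        execute(statement)
    return []
```

This only helps when each individual statement is short. The time check happens between statements, so a single statement that runs for 30 minutes will still outlive the function.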
Re your comment:
It's quite easy for an SQL query to run for 30 minutes or more if it is scanning a large amount of data in an inefficient manner. Sometimes SQL queries are so inefficient that it's difficult to make them run in less than 30 minutes!
MySQL does have a feature to terminate a query after a time limit (the max_execution_time setting, which according to the MySQL manual applies only to read-only SELECT statements), but I think that's the wrong solution. If you force the server to limit the time for a query, it doesn't mean you get the results in less time; it means the query is cancelled and you don't get any results. Likewise, if an UPDATE is cancelled some other way, for example with KILL QUERY, you aren't waiting for a set of rows as the result, but the statement is still cancelled. In other words, the UPDATE you wanted won't happen.
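For reference, this is roughly what the per-statement form of that time limit looks like. The sketch below (in Python, only building the SQL text) adds MySQL's MAX_EXECUTION_TIME optimizer hint, whose value is in milliseconds. The function name and the `orders` table are hypothetical. The hint is documented as applying to SELECT statements only, so the helper refuses anything else rather than pretending to limit it.

```python
import re

# Matches the leading SELECT keyword, ignoring case and leading whitespace.
_SELECT_RE = re.compile(r"^\s*SELECT\b", re.IGNORECASE)


def with_max_execution_time(sql: str, limit_ms: int) -> str:
    """Return `sql` with a MAX_EXECUTION_TIME hint placed right after
    the SELECT keyword.

    Raises ValueError for non-SELECT statements, because the hint has
    no effect on INSERT, UPDATE, or DELETE.
    """
    if limit_ms <= 0:
        raise ValueError("limit_ms must be a positive number of milliseconds")
    match = _SELECT_RE.match(sql)
    if match is None:
        raise ValueError("MAX_EXECUTION_TIME only applies to SELECT statements")
    hint = f" /*+ MAX_EXECUTION_TIME({int(limit_ms)}) */"
    return sql[: match.end()] + hint + sql[match.end():]
```

For example, with_max_execution_time("SELECT id FROM orders", 5000) produces "SELECT /*+ MAX_EXECUTION_TIME(5000) */ id FROM orders". If the limit is hit, the server aborts the statement with an error and returns no rows.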
The better solution is to optimize the query, so it normally runs quickly enough to give you the results before you would want it to time out.