sql.DB on AWS Lambda: too many connections

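My understanding in Go: the DB handle is meant to be long-lived and shared between many goroutines.

But when I use Go with AWS Lambda, things are quite different, because Lambda stops the function once it finishes.

I call defer db.Close() inside the Lambda invoke function, but it has no effect. On the MySQL side the connection stays open as a Sleep query. As a result, MySQL runs into too many connections.

For now I have to set wait_timeout in MySQL to a small value, but I don't think that is the best solution.

Is there any way to close the connection when using the Go SQL driver in Lambda?

Thanks,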

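There are two problems to solve:

  • Managing state properly between Lambda invocations
  • Configuring a connection pool

Managing state properly

Let's look at how AWS manages containers. From the AWS docs: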

After a Lambda function is executed, AWS Lambda maintains the execution context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the execution context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again. This execution context reuse approach has the following implications:

  • Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We suggest adding logic in your code to check if a connection exists before creating one.

  • Each execution context provides 500MB of additional disk space in the /tmp directory. The directory content remains when the execution context is frozen, providing transient cache that can be used for multiple invocations. You can add extra code to check if the cache has the data that you stored. For information on deployment limits, see AWS Lambda Limits.

  • Background processes or callbacks initiated by your Lambda function that did not complete when the function ended resume if AWS Lambda chooses to reuse the execution context. You should make sure any background processes or callbacks (in case of Node.js) in your code are complete before the code exits.

The first bullet says that state is preserved between executions. Let's see that in action:

let counter = 0

module.exports.handler = (event, context, callback) => {
  counter++
  callback(null, { count: counter })
}

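If you deploy it and invoke it several times in a row, you will see the counter increment between invocations, as long as Lambda reuses the same execution context.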
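The same holds for Go, since a package-level variable is initialized once per process and not once per call. Below is a minimal sketch of the counter example in Go, using only the standard library. The loop in main merely simulates Lambda reusing one execution context; in a real function the handler would be passed to lambda.Start instead. The response type and handler name are made up for illustration.

```go
package main

import "fmt"

// counter lives at package level, so it is initialized once per process
// (that is, once per execution context), not once per invocation.
var counter int

// response is a hypothetical payload type used only for this sketch.
type response struct {
	Count int `json:"count"`
}

// handler increments the shared counter and reports its current value.
func handler() (response, error) {
	counter++
	return response{Count: counter}, nil
}

func main() {
	// Simulate three invocations served by the same warm process.
	for i := 0; i < 3; i++ {
		r, err := handler()
		if err != nil {
			panic(err)
		}
		fmt.Println(r.Count) // prints 1, then 2, then 3
	}
}
```

A cold start creates a new process, so the counter starts from zero again there. The same applies to a package-level database handle.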

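Now you know: you should not call defer db.Close(). Instead, you should reuse the database instance. You can do that simply by making db a package-level variable.

First, create a database package that exports an Open function: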

package database

import (
    "fmt"
    "os"

    _ "github.com/go-sql-driver/mysql"
    "github.com/jinzhu/gorm"
)

var (
    host = os.Getenv("DB_HOST")
    port = os.Getenv("DB_PORT")
    user = os.Getenv("DB_USER")
    name = os.Getenv("DB_NAME")
    pass = os.Getenv("DB_PASS")
)

func Open() (db *gorm.DB) {
    args := fmt.Sprintf("%s:%s@tcp(%s:%s)/%s?parseTime=true", user, pass, host, port, name)
    // Initialize a new db connection.
    db, err := gorm.Open("mysql", args)
    if err != nil {
        panic(err)
    }
    return
}

Then use it in your handler.go file:

package main

import (
    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/jinzhu/gorm"
    "github.com/<username>/<name-of-lib>/database"
)

var db *gorm.DB

func init() {
    db = database.Open()
}

func Handler() (events.APIGatewayProxyResponse, error) {
    // You can use db here.
    return events.APIGatewayProxyResponse{
        StatusCode: 201,
    }, nil
}

func main() {
    lambda.Start(Handler)
}

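Note: don't forget to replace github.com/<username>/<name-of-lib>/database with the correct path.

You may still see the too many connections error after this. If that happens, you will need a connection pool.

Configuring a connection pool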

From Wikipedia:

In software engineering, a connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database.

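You will need a connection pool whose allowed number of connections matches the number of Lambdas running in parallel. You have two options:

  • MySQL Proxy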

MySQL Proxy is a simple program that sits between your client and MySQL server(s) and that can monitor, analyze or transform their communication. Its flexibility allows for a wide variety of uses, including load balancing, failover, query analysis, query filtering and modification, and many more.

  • AWS Aurora:

Amazon Aurora Serverless is an on-demand, auto-scaling configuration for Amazon Aurora (MySQL-compatible edition), where the database will automatically start up, shut down, and scale capacity up or down based on your application's needs. It enables you to run your database in the cloud without managing any database instances. It's a simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.

Whichever you choose, there are plenty of tutorials on the internet on how to configure either one.
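Separately from the server-side options above, the Go handle can be bounded too. A *sql.DB is itself a connection pool, and each Lambda execution context holds its own. Since one execution context serves one invocation at a time, capping that pool at a single connection is usually enough unless the handler runs queries concurrently. The sketch below is only illustrative: the "stub" driver is a made-up placeholder so the code runs without a MySQL server, openLimited is a hypothetical helper, and the limits are example values. With the gorm version used above, the underlying *sql.DB should be reachable through db.DB().

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"time"
)

// stubDriver is a hypothetical placeholder so this sketch runs without a
// database. Real code would import a MySQL driver and use the "mysql" name.
type stubDriver struct{}

// Open always fails: the stub never creates real connections.
func (stubDriver) Open(dsn string) (driver.Conn, error) {
	return nil, errors.New("stub driver: no real connections")
}

func init() {
	sql.Register("stub", stubDriver{})
}

// openLimited opens a handle and caps its pool. The values are assumptions
// chosen for illustration, not recommendations from the original answer.
func openLimited(driverName, dsn string) (*sql.DB, error) {
	db, err := sql.Open(driverName, dsn) // validates arguments, does not connect yet
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(1)                  // at most one connection per execution context
	db.SetMaxIdleConns(1)                  // keep that one connection around for reuse
	db.SetConnMaxLifetime(5 * time.Minute) // recycle before the server-side wait_timeout
	return db, nil
}

func main() {
	db, err := openLimited("stub", "")
	if err != nil {
		panic(err)
	}
	fmt.Println("max open connections:", db.Stats().MaxOpenConnections)
}
```

With a cap like this, the total number of connections MySQL sees is roughly bounded by the function's concurrency, which can then be compared against max_connections on the server.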