Why is my Lambda not able to talk to ElastiCache?

I have a Redis ElastiCache cluster whose primary node has an FQDN of the form master.clustername.x.euw1.cache.amazonaws.com. I also have a Route53 record with a CNAME that points to that FQDN.

I have a .NET Core Lambda in the same VPC as the cluster, and its security group allows it to reach the cluster. The Lambda talks to the cluster using StackExchange.Redis, the Redis library developed by Stack Overflow.

If I give the Lambda the FQDN of the Redis cluster as the hostname (the one starting with master), I can connect, save data, and read it back.

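If I give the Lambda the CNAME instead, it does not connect, even though the CNAME resolves to the same IP address as the FQDN both when I ping it from my local machine and when I call Dns.GetHostEntry inside the Lambda. I get the following error message: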

One or more errors occurred. (It was not possible to connect to the redis server(s); to create a disconnected multiplexer, disable AbortOnConnectFail. SocketFailure on PING): AggregateException
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at lambda_method(Closure , Stream , Stream , LambdaContextInternal )

at StackExchange.Redis.ConnectionMultiplexer.ConnectImpl(Func`1 multiplexerFactory, TextWriter log) in c:\code\StackExchange.Redis\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line 890
at lambda.Redis.RedisClientBuilder.Build(String redisHost, String redisPort, Int32 redisDbId) in C:\BuildAgent\workd24911506461d0\src\Lambda\Redis\RedisClientBuilder.cs:line 9
at lambda.Ioc.ServiceBuilder.GetRedisClient() in C:\BuildAgent\workd24911506461d0\src\Lambda\IoC\ServiceBuilder.cs:line 18
at lambda.Ioc.ServiceBuilder.GetServices() in C:\BuildAgent\workd24911506461d0\src\Lambda\IoC\ServiceBuilder.cs:line 11
at Handlers.OrderHandler.Run(SNSEvent request, ILambdaContext context) in C:\BuildAgent\workd24911506461d0\src\Lambda\Handlers\OrderHandler.cs:line 26

Has anyone seen something similar?

A possible way to isolate the problem from your client library: follow AWS's tutorial and rewrite your Lambda along the lines of the code below (the example is in Python).

from __future__ import print_function
import time
import uuid
import sys
import socket
import elasticache_auto_discovery
from pymemcache.client.hash import HashClient

# ElastiCache settings
elasticache_config_endpoint = "your-elasticache-cluster-endpoint:port"
nodes = elasticache_auto_discovery.discover(elasticache_config_endpoint)
# Build a concrete list of (host, port) tuples; on Python 3, map() would return a one-shot iterator.
nodes = [(x[1], int(x[2])) for x in nodes]
memcache_client = HashClient(nodes)

def handler(event, context):
    """
    This function writes a value to memcache and reads it back.
    Memcache is hosted on ElastiCache.
    """

    # Create a random UUID; this will be the sample element we add to the cache.
    uuid_inserted = uuid.uuid4().hex
    #Put the UUID to the cache.
    memcache_client.set('uuid', uuid_inserted)
    #Get item (UUID) from the cache.
    uuid_obtained = memcache_client.get('uuid')
    if uuid_obtained.decode("utf-8") == uuid_inserted:
        # this print should go to the CloudWatch Logs and Lambda console.
        print ("Success: Fetched value %s from memcache" %(uuid_inserted))
    else:
        raise Exception("Value is not the same as we put :(. Expected %s got %s" %(uuid_inserted, uuid_obtained))

    return "Fetched value from memcache: " + uuid_obtained.decode("utf-8")

Reference: https://docs.aws.amazon.com/lambda/latest/dg/vpc-ec-deployment-pkg.html

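It turns out that I was using an SSL certificate on the ElastiCache cluster, and that certificate is bound to the master. endpoint. Because I was connecting through the CNAME, the hostname the client checked did not match the certificate, so certificate validation failed.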
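To illustrate why the same IP address can still fail validation, here is a minimal sketch of the hostname check a TLS client performs. It is a simplification, not the actual StackExchange.Redis or .NET logic. The wildcard certificate name and the alias `redis.example.internal` are assumptions made up for the example; the real certificate may list different names.

```python
def hostname_matches(cert_name, hostname):
    """Simplified hostname check: '*' may stand for the left-most label only."""
    cert_labels = cert_name.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    # A wildcard never spans several labels, so the label counts must agree.
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        return cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


# Hypothetical names, for illustration only.
CERT_NAME = "*.clustername.x.euw1.cache.amazonaws.com"
MASTER_ENDPOINT = "master.clustername.x.euw1.cache.amazonaws.com"
CNAME_ALIAS = "redis.example.internal"

print(hostname_matches(CERT_NAME, MASTER_ENDPOINT))  # True
print(hostname_matches(CERT_NAME, CNAME_ALIAS))      # False
```

The client compares the certificate against the name it was asked to connect to, not against the IP address that name resolves to, which is why the alias is rejected.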

So I ended up querying the Route53 record in code to get the master endpoint and connecting to that, and it worked.
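A rough sketch of that lookup, written in Python to match the example above rather than the .NET code I actually used. It assumes a boto3 Route53 client is passed in; the function name, the hosted zone ID, and the record name are hypothetical placeholders.

```python
def resolve_cname_target(route53_client, hosted_zone_id, record_name):
    """Return the target of the CNAME record for record_name, without a trailing dot."""
    # Route53 returns fully qualified names, which end with a dot.
    wanted = record_name.rstrip(".").lower() + "."
    response = route53_client.list_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        StartRecordName=wanted,
        StartRecordType="CNAME",
        MaxItems="1",
    )
    for record_set in response.get("ResourceRecordSets", []):
        # The listing starts at the requested name but can return a later
        # record when there is no exact match, so check name and type.
        if record_set["Name"].lower() == wanted and record_set["Type"] == "CNAME":
            return record_set["ResourceRecords"][0]["Value"].rstrip(".")
    raise LookupError("No CNAME record found for %s" % record_name)


# Example call (assumes boto3 is installed and the Lambda role may read the zone):
#   import boto3
#   host = resolve_cname_target(boto3.client("route53"), "ZHYPOTHETICAL", "redis.example.internal")
#   ...then pass `host` to the Redis client so the name matches the certificate.
```

The Lambda's execution role needs permission to list record sets in the hosted zone, and caching the result outside the handler avoids a Route53 call on every invocation.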