Akka Cluster in Docker in ElasticBeanstalk

I am trying to set up an Akka cluster in Docker in ElasticBeanstalk. The nodes must communicate with each other like this:

+-------------------------------------------------------+
| ElasticBeanstalk/ECS                                  |
|                                                       |
| +----------------------+     +----------------------+ |
| |  EC2                 |     |  EC2                 | |
| |                      |     |                      | |
| | +------------------+ |     | +------------------+ | |
| | |  Docker          | |     | |  Docker          | | |
| | |                  | |     | |                  | | |
| | |  +------------+  | |     | |  +------------+  | | |
| | |  |            |  | |     | |  |            |  | | |
| | |  |            +---------->->-->            |  | | |
| | |  |  Akka      |  | |     | |  |  Akka      |  | | |
| | |  |            <--<-<----------+            |  | | |
| | |  |            |  | |     | |  |            |  | | |
| | |  +------------+  | |     | |  +------------+  | | |
| | +------------------+ |     | +------------------+ | |
| +----------------------+     +----------------------+ |
+-------------------------------------------------------+

Using information from the blog posts "Akka Cluster EC2 Autoscaling" (which doesn't include Docker) and "Akka Cluster in Docker" (which doesn't include EC2), I have put together an almost complete solution.

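The last hurdle is communication between the nodes. Each node correctly identifies the other's internal IP. I am assuming that the EC2 instances can communicate directly, bypassing the ECS load balancer.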

The Akka nodes are listening on port 2551.

/app # netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 :::sunproxyadmin        :::*                    LISTEN
tcp        0      0 d0a81ebbe72a:2551       :::*                    LISTEN

The Docker instances are exposing port 2551.

# docker ps
CONTAINER ID        IMAGE                                 COMMAND                  CREATED             STATUS              PORTS                    NAMES
d0a81ebbe72a        mystuff/potter:v1.0.7-cluster04       "sh -c 'java -jar -Xm"   About an hour ago   Up About an hour    0.0.0.0:2551->2551/tcp   ecs-awseb-maptiles-dev-uicd96apyp-6-potter-b8d6a7aef2c4c9c0a001
d6bc31f1798b        amazon/amazon-ecs-agent:latest        "/agent"                 About an hour ago   Up About an hour                             ecs-agent

The EC2 instances have a security group that allows incoming connections on port 2551.

$ aws ec2 describe-instances --instance-ids "i-0750627a98ba930d4" "i-0bcd64a4121165327" | jq '.Reservations[].Instances[].SecurityGroups[]'
{
  "GroupName": "akka-remoting",
  "GroupId": "sg-6c267e16"
}
{
  "GroupName": "akka-remoting",
  "GroupId": "sg-6c267e16"
}

$ aws ec2 describe-security-groups --group-names akka-remoting | jq -c '.SecurityGroups[].IpPermissions'
[{"PrefixListIds":[],"FromPort":2551,"IpRanges":[{"CidrIp":"0.0.0.0/0"}],"ToPort":2551,"IpProtocol":"tcp","UserIdGroupPairs":[],"Ipv6Ranges":[{"CidrIpv6":"::/0"}]}]

$ aws ec2 describe-security-groups --group-names akka-remoting | jq -c '.SecurityGroups[].IpPermissionsEgress'
[{"PrefixListIds":[],"FromPort":2551,"IpRanges":[{"CidrIp":"0.0.0.0/0"}],"ToPort":2551,"IpProtocol":"tcp","UserIdGroupPairs":[],"Ipv6Ranges":[{"CidrIpv6":"::/0"}]}]

But the nodes still cannot see each other.

[INFO] [08/23/2017 23:31:37.227] [main] [akka.remote.Remoting] Starting remoting
[INFO] [08/23/2017 23:31:37.805] [main] [akka.remote.Remoting] Remoting started; listening on addresses :[akka.tcp://potter@172.31.12.161:2551]
[INFO] [08/23/2017 23:31:37.818] [main] [akka.cluster.Cluster(akka://potter)] Cluster Node [akka.tcp://potter@172.31.12.161:2551] - Starting up...
[INFO] [08/23/2017 23:31:37.867] [main] [akka.cluster.Cluster(akka://potter)] Cluster Node [akka.tcp://potter@172.31.12.161:2551] - Registered cluster JMX MBean [akka:type=Cluster]
[INFO] [08/23/2017 23:31:37.867] [main] [akka.cluster.Cluster(akka://potter)] Cluster Node [akka.tcp://potter@172.31.12.161:2551] - Started up successfully
[WARN] [08/23/2017 23:31:38.053] [New I/O boss #3] [NettyTransport(akka://potter)] Remote connection to [null] failed with java.net.ConnectException: Connection refused: /172.31.35.149:2551
[WARN] [08/23/2017 23:31:38.056] [potter-akka.remote.default-remote-dispatcher-7] [akka.tcp://potter@172.31.12.161:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fpotter%40172.31.35.149%3A2551-0] Association with remote system [akka.tcp://potter@172.31.35.149:2551] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://potter@172.31.35.149:2551]] Caused by: [Connection refused: /172.31.35.149:2551]
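
The "Connection refused" in the warnings above may itself be a hint. A refusal usually means the packet reached the other host and nothing was accepting on that address and port. A port blocked by a security group would more typically show up as a timeout. One way to check this from one EC2 instance against the other is a small TCP probe. The sketch below uses only the Python standard library; the `probe` helper is a hypothetical name for illustration, not part of Akka or the AWS tooling.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused', or 'unreachable'."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # something accepted the connection
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this address/port.
        return "refused"
    except OSError:
        # Timeout, no route to host, DNS failure, and similar errors.
        return "unreachable"
```

Calling `probe("172.31.35.149", 2551)` from the other instance and getting "refused" would point at the listener (for example, the address it is bound to) rather than at the security group.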

What have I missed or misunderstood?

Following the advice in the AWS forum post "Single Docker Container with multiple open ports", setting Akka's akka.remote.netty.tcp.bind-hostname to 0.0.0.0 instead of the local IP enabled communication.
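
In application.conf terms, that change might look roughly like the fragment below. This is a sketch, assuming Akka classic remoting (the akka.remote.netty.tcp section named above). The IP address is the one from the log output in the question and is only illustrative; in a real deployment it would presumably be supplied per instance rather than hard-coded.

```hocon
akka.remote.netty.tcp {
  # Address advertised to the other cluster nodes: the EC2 instance's
  # private IP (illustrative value, taken from the log output in the question).
  hostname = "172.31.12.161"
  port = 2551

  # Address the socket actually binds to inside the container.
  # 0.0.0.0 accepts connections on every container interface.
  bind-hostname = "0.0.0.0"
}
```

The point of the split is that hostname stays the address other nodes use to reach this one, while bind-hostname only controls where the listener binds inside the container.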