Akka.net remoting over Docker containers: client randomly fails to connect to host
There is a simple host with a TestActor that only writes any string it receives to the console:
using (var actorSystem = ActorSystem.Create("host", HoconLoader.FromFile("config.hocon")))
{
    var testActor = actorSystem.ActorOf(Props.Create<TestActor>(), "TestActor");
    Console.WriteLine($"Waiting for requests...");
    while (true)
    {
        Task.Delay(1000).Wait();
    }
}
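The post does not show TestActor or TestMessage, so the following is only a minimal sketch of what they might look like. The `Text` property, the "ACK" reply, the `HostProgram` class and the inline HOCON (standing in for the post's `HoconLoader.FromFile` helper) are all assumptions made for illustration. It also waits on `WhenTerminated` instead of polling with `Task.Delay`.

```csharp
using System;
using Akka.Actor;
using Akka.Configuration;

// Hypothetical message type; the original post does not show its definition.
public sealed class TestMessage
{
    public TestMessage(string text) { Text = text; }
    public string Text { get; }
}

// Hypothetical actor: prints the payload and replies so that a remote Ask can complete.
public sealed class TestActor : ReceiveActor
{
    public TestActor()
    {
        Receive<TestMessage>(msg =>
        {
            Console.WriteLine(msg.Text);
            Sender.Tell("ACK", Self);
        });
    }
}

public static class HostProgram
{
    public static void Main()
    {
        // Inline HOCON replaces the post's HoconLoader helper (assumption).
        var config = ConfigurationFactory.ParseString(@"
            akka {
              actor.provider = remote
              remote.dot-netty.tcp {
                hostname = host
                port = 8081
              }
            }");

        using (var system = ActorSystem.Create("host", config))
        {
            system.ActorOf(Props.Create<TestActor>(), "TestActor");
            Console.WriteLine("Waiting for requests...");
            // Block until the actor system terminates instead of looping on Task.Delay.
            system.WhenTerminated.Wait();
        }
    }
}
```

Without a reply such as the `Sender.Tell` above, an Ask on the client side can never complete, so it is worth checking that the real TestActor answers.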
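On the other side, there is a simple client that selects the remote actor, passes a TestMessage to it, and then waits for the reply without specifying a timeout.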
using (var actorSystem = ActorSystem.Create("client", HoconLoader.FromFile("config.hocon")))
{
    var testActor = actorSystem.ActorSelection("akka.tcp://host@host:8081/user/TestActor");
    Console.WriteLine($"Sending message...");
    testActor.Ask(new TestMessage($"Message")).Wait();
    Console.WriteLine($"Message ACKed.");
}
The client and the host are deployed in two separate Docker containers (docker-compose), whose network configuration is as follows (docker network inspect ...):
[
    {
        "Name": "akkaremotetest_default",
        "Id": "4995d7e340e09e4babcca7dc02ddf4f68f70761746c1246d66eaf7ee40ccec89",
        "Created": "2018-07-21T07:55:39.3534215Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.19.0.0/16",
                    "Gateway": "172.19.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "6040c260c5195d2fe350bf3c89b5f9ede8a65d44da6adb48817fbef266a99e07": {
                "Name": "akkaremotetest_host_1",
                "EndpointID": "a6220a6fee071a29b83e30f9aeb9b9e7ec5008f04f593ff3fb2464477a7e54aa",
                "MacAddress": "02:42:ac:13:00:02",
                "IPv4Address": "172.19.0.2/16",
                "IPv6Address": ""
            },
            "a97078c28c7d221c2c9af948fe36b72590251be69e06d0e66eafd2c74f416037": {
                "Name": "akkaremotetest_client_1",
                "EndpointID": "39bcb8b1047ad666d9c568ee968602b3a93edb4ac2151ba9c3f3c02359ef84f2",
                "MacAddress": "02:42:ac:13:00:03",
                "IPv4Address": "172.19.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
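After the containers start, the result is one of the following:
- the client's Ask succeeds, the actor writes the received message to the console, and the client confirms success,
- the client hangs forever, the actor never receives the message, and no timeout occurs.
The problem is that the latter happens most of the time, but only when the host and the client are deployed in Docker containers. When they run standalone, there are no communication problems.
I think I have tried everything without result, and I don't know what else I can do to investigate why the client's Ask hangs forever while neither of the two actor systems logs any error.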
Here is the Docker configuration (yml):
version: '2'
services:
  host:
    ports:
      - 8081:8081
    build:
      context: .
      dockerfile: Dockerfile
      args:
        PROJECT_DIR: Host
        PROJECT_NAME: Host
        WAIT_FOR_HOST: 0
    restart: on-failure
  client:
    depends_on:
      - host
    ports:
      - 8082:8082
    build:
      context: .
      dockerfile: Dockerfile
      args:
        PROJECT_DIR: Client
        PROJECT_NAME: Client
        WAIT_FOR_HOST: 1
    restart: on-failure
  tcpdump:
    image: kaazing/tcpdump
    network_mode: "host"
    volumes:
      - ./tcpdump:/tcpdump
Here is the client system's configuration (config.hocon):
akka {
  actor {
    provider = remote
  }
  remote {
    dot-netty.tcp {
      enabled-transports = ["akka.remote.netty.tcp"]
      hostname = client
      port = 8082
    }
  }
  stdout-loglevel = DEBUG
  loglevel = DEBUG
  log-config-on-start = on
  actor {
    creation-timeout = 20s
    debug {
      receive = on
      autoreceive = on
      lifecycle = on
      event-stream = on
      unhandled = on
      fsm = on
      event-stream = on
      log-sent-messages = on
      log-received-messages = on
      router-misconfiguration = on
    }
  }
}
Here is the host system's configuration (config.hocon):
akka {
  actor {
    provider = remote
  }
  remote {
    dot-netty.tcp {
      enabled-transports = ["akka.remote.netty.tcp"]
      hostname = host
      port = 8081
    }
  }
  stdout-loglevel = DEBUG
  loglevel = DEBUG
  log-config-on-start = on
  actor {
    creation-timeout = 20s
    debug {
      receive = on
      autoreceive = on
      lifecycle = on
      event-stream = on
      unhandled = on
      fsm = on
      event-stream = on
      log-sent-messages = on
      log-received-messages = on
      router-misconfiguration = on
    }
  }
}
Following the documentation on Akka remote configuration, I tried changing the client configuration like this:
remote {
  dot-netty.tcp {
    enabled-transports = ["akka.remote.netty.tcp"]
    hostname = 172.19.0.3
    port = 8082
    bind-hostname = client
    bind-port = 8082
  }
}
and the host configuration analogously:
remote {
  dot-netty.tcp {
    enabled-transports = ["akka.remote.netty.tcp"]
    hostname = 172.19.0.2
    port = 8081
    bind-hostname = host
    bind-port = 8081
  }
}
The actor selection changed slightly as well:
var testActor = actorSystem.ActorSelection("akka.tcp://host@172.19.0.2:8081/user/TestActor");
Unfortunately, this did not help at all (nothing changed).
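For comparison, the `bind-hostname` and `bind-port` keys appear to come from the JVM Akka documentation; Akka.NET's dot-netty transport, as far as I know, separates the bound address from the advertised one with `hostname` and `public-hostname` instead. The sketch below shows that variant for the client. The `ClientConfigSketch` class name and the choice of `0.0.0.0` as the bind address are assumptions, and this is not claimed to fix the hang described here.

```csharp
using Akka.Actor;
using Akka.Configuration;

// Hypothetical helper that builds the client configuration inline.
public static class ClientConfigSketch
{
    public static Config Build() => ConfigurationFactory.ParseString(@"
        akka {
          actor.provider = remote
          remote.dot-netty.tcp {
            # address the socket binds to inside the container (assumption)
            hostname = 0.0.0.0
            # address advertised to remote actor systems
            public-hostname = client
            port = 8082
          }
        }");
}
```

It could be used as `ActorSystem.Create("client", ClientConfigSketch.Build())`; the host side would mirror it with `public-hostname = host` and `port = 8081`.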
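Among the logs generated along the way, there is one key entry, produced by the host system. Communication succeeds only when it appears (and most of the time it does not):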
[DEBUG][07/21/2018 09:42:50][Thread 0006][remoting] Associated [akka.tcp://host@host:8081] <- akka.tcp://client@client:8082
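Rather than searching the DEBUG output for that entry, the association could also be observed programmatically. The following is a rough sketch, assuming Akka.Remote publishes its lifecycle events (`AssociatedEvent`, `DisassociatedEvent`, `AssociationErrorEvent`) on the system event stream; the `AssociationLogger` actor is a hypothetical name introduced here.

```csharp
using System;
using Akka.Actor;
using Akka.Remote;

// Hypothetical helper actor: prints every remoting lifecycle event it receives.
public sealed class AssociationLogger : ReceiveActor
{
    public AssociationLogger()
    {
        Receive<AssociatedEvent>(e =>
            Console.WriteLine($"Associated: {e.LocalAddress} <-> {e.RemoteAddress} (inbound: {e.IsInbound})"));
        Receive<DisassociatedEvent>(e =>
            Console.WriteLine($"Disassociated: {e.LocalAddress} <-> {e.RemoteAddress}"));
        Receive<AssociationErrorEvent>(e =>
            Console.WriteLine($"Association error with {e.RemoteAddress}: {e.Cause.Message}"));
    }

    protected override void PreStart()
    {
        // Subscribe to the common base type so all remoting lifecycle events are delivered.
        Context.System.EventStream.Subscribe(Self, typeof(RemotingLifecycleEvent));
    }
}
```

Creating it on both sides with `actorSystem.ActorOf(Props.Create<AssociationLogger>(), "association-logger")` would show whether an association is ever attempted in the failing runs.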
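Any help would be appreciated. Thanks!
-- EDIT --
I added a tcpdump section to the yml and opened the generated dump file in Wireshark. I also added a 5-second timeout to the wait on the Ask. I find the results hard to interpret, but this is what I get when a connection attempt fails: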
172.19.0.3 -> 172.19.0.2: SYN
172.19.0.2 -> 172.19.0.3: SYN, ACK
172.19.0.3 -> 172.19.0.2: ACK
[a 5-second period of silence (waiting till timeout)]
172.19.0.3 -> 172.19.0.2: FIN, ACK
172.19.0.2 -> 172.19.0.3: ACK
172.19.0.2 -> 172.19.0.3: FIN, ACK
172.19.0.3 -> 172.19.0.2: ACK
And this is what happens when the connection succeeds:
172.19.0.3 -> 172.19.0.2: SYN
172.19.0.2 -> 172.19.0.3: SYN, ACK
172.19.0.3 -> 172.19.0.2: ACK
172.19.0.3 -> 172.19.0.2: PSH, ACK
172.19.0.2 -> 172.19.0.3: ACK
172.19.0.2 -> 172.19.0.3: PSH, ACK
172.19.0.3 -> 172.19.0.2: ACK
172.19.0.3 -> 172.19.0.2: PSH, ACK
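In the failing capture the TCP handshake completes, but no PSH segment follows, so no application data is exchanged before the timeout closes the connection. One way to bound the wait on the client is sketched below; the `ClientProbe` helper and its parameters are hypothetical, and the exception raised when an Ask expires differs between Akka.NET versions, so the sketch catches broadly.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

public static class ClientProbe
{
    // Returns true if the remote actor answered within the timeout, false otherwise.
    public static async Task<bool> TryAskAsync(
        ActorSystem system, string path, object message, TimeSpan timeout)
    {
        try
        {
            // ResolveOne fails if no actor at the path identifies itself in time.
            IActorRef remote = await system.ActorSelection(path).ResolveOne(timeout);
            object reply = await remote.Ask(message, timeout);
            Console.WriteLine($"Reply: {reply}");
            return true;
        }
        catch (Exception ex)
        {
            // Report the exception type instead of assuming a specific one.
            Console.WriteLine($"No reply within {timeout}: {ex.GetType().Name}");
            return false;
        }
    }
}
```

A call such as `ClientProbe.TryAskAsync(actorSystem, "akka.tcp://host@host:8081/user/TestActor", new TestMessage("Message"), TimeSpan.FromSeconds(5))` would then fail fast instead of hanging indefinitely.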
Versions:
- Akka.NET 1.3.8
- .NET Core 2.1.1
- Docker 18.03.1-ce, build 9ee9f40
- Docker-compose 1.21.1, build 7641a569
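According to the reply quoted below, the problem turned out to stem from the project depending on .NET Core 2.1, which Akka.NET does not support yet: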
We don't officially support .NET Core 2.1 yet. Heck, we aren't even on
netstandard 2.0 yet (although work is underway). But thanks for
confirming that there are indeed issues :)
After switching to .NET Core 2.0, I could no longer reproduce the described problem.