ZMQ sockets do not work as expected on Kubernetes
Short summary: when I deploy my code on Kubernetes, my ZMQ sockets do not receive (and possibly do not send) messages.
I have an application involving several clients and servers. It is developed in Node and uses ZeroMQ as the communication layer. It works on my local machine, it works on Docker, and now I am trying to deploy the application with Kubernetes.
When the application is deployed, the pods, deployments and Kubernetes services start. Apparently everything is fine, but the initial messages the clients send never reach the server. The deployments are in the same namespace and I use Flannel as the CNI. As far as I can tell the cluster is correctly initialized, but the messages never arrive.
I read this post about problems binding ZMQ sockets on Kubernetes. I tried the ZMQ_CONNECT_TIMEOUT parameter, but it did not do anything. Also, unlike the question I referenced, my messages never arrive at all.
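For reference, this is roughly how the option can be set. The raw option number is an assumption on my part for the zeromq v5 Node bindings, which pass numeric option codes through to libzmq (ZMQ_CONNECT_TIMEOUT is option 79, available since libzmq 4.2):

// Sketch only: set a connect timeout on a PUSH socket (zeromq v5 bindings assumed).
const zmq = require("zeromq");
const sock = zmq.socket("push");
sock.setsockopt(79 /* ZMQ_CONNECT_TIMEOUT */, 2000); // give up connecting after 2 s
sock.connect("tcp://servidor:7777");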
I could provide some code, but there is a lot of it and I do not think the application is the problem. I suspect I am missing something in the Kubernetes configuration, since this is my first time using it. Please let me know if you need more information.
Edit 1. 12/01/2021
As @anemyte suggested, I will try to provide a simplified version of the code:
Client side:
initiate () {
  return new Promise(resolve => {
    this.N_INCOMING = 0;
    this.N_OUTGOING = 0;
    this.rrCounter = 0;
    // Bind the local PULL and PUB sockets, then connect out to the server.
    this.PULL_SOCKET.bind("tcp://*:" + this.MY_PORT, error => {
      utils.handleError(error);
      this.PUB_SOCKET.bind("tcp://*:" + (this.MY_PORT + 1), error => {
        utils.handleError(error);
        this.SUB_SOCKET.subscribe("");
        this.SUB_SOCKET.connect(this.SERVER + ":" + (this.SERVER_PORT + 1),
          error => { utils.handleError(error); });
        this.PULL_SOCKET.on("message", (m) => this.handlePullSocket(m));
        this.SUB_SOCKET.on("message", (m) => this.handleSubSocket(m));
        this.SERVER_PUSH_SOCKET = zmq.socket("push");
        this.SERVER_PUSH_SOCKET.connect(this.SERVER + ":" + this.SERVER_PORT,
          error => { utils.handleError(error); });
        this.sendHello(); // first message to the server; this is what never arrives
        resolve();
      });
    });
  });
}
Server side:
initiate () {
  return new Promise(resolve => {
    // MY_IP presumably carries the tcp:// prefix (SERVER_ADDRESS is tcp://* in the deployment below).
    this.PULL_SOCKET.bind(this.MY_IP + ":" + this.MY_PORT, err => {
      if (err) {
        console.log(err);
        process.exit(0);
      }
      this.PUB_SOCKET.bind(this.MY_IP + ":" + (this.MY_PORT + 1), err => {
        if (err) {
          console.log(err);
          process.exit(0);
        }
        this.PULL_SOCKET.on("message", (m) => this.handlePullSocket(m));
        resolve();
      });
    });
  });
}
The clients initiate the connection by sending a hello message. The server's listener function handlePullSocket is supposed to handle those messages.
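To isolate the problem from the rest of the application, a minimal PUSH/PULL pair can be deployed in the same two pods. This is my own sketch (the file names are mine; the addresses and the zeromq v5 callback API mirror the code above):

// minimal-pull-server.js — stripped-down stand-in for the server
const zmq = require("zeromq");
const pull = zmq.socket("pull");
pull.bind("tcp://*:7777", err => {
  if (err) { console.error(err); process.exit(1); }
  console.log("PULL socket bound on 7777");
});
pull.on("message", m => console.log("received:", m.toString()));

// minimal-push-client.js — stripped-down stand-in for a client
const zmq = require("zeromq");
const push = zmq.socket("push");
push.connect("tcp://servidor:7777"); // the Service name, resolved through cluster DNS
setInterval(() => push.send("hello"), 1000);

If nothing arrives with this pair either, the application code is ruled out and the problem lies in cluster networking or name resolution.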
Edit 2. 12/01/2021
As requested, I am adding the deployment/service configuration:
client-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert -f docker-compose-resolved.yml
    kompose.version: 1.19.0 (f63a961c)
  creationTimestamp: null
  labels:
    io.kompose.service: c1
  name: c1
  namespace: fishtrace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: c1
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        kompose.cmd: kompose convert -f docker-compose-resolved.yml
        kompose.version: 1.19.0 (f63a961c)
      creationTimestamp: null
      labels:
        app: c1
    spec:
      containers:
        - env:
            - name: NODO_ADDRESS
              value: 0.0.0.0
            - name: NODO_PUERTO
              value: "9999"
            - name: NODO_PUERTO_CADENA
              value: "8888"
            - name: SERVER_ADDRESS
              value: tcp://servidor
            - name: SERVER_PUERTO
              value: "7777"
          image: registrogeminis.com/docker_c1_rpi:latest
          name: c1
          ports:
            - containerPort: 8888
            - containerPort: 9999
          resources: {}
          volumeMounts:
            - mountPath: /app/vol
              name: c1-volume
          imagePullPolicy: Always
      restartPolicy: Always
      imagePullSecrets:
        - name: myregistrykey
      volumes:
        - name: c1-volume
          persistentVolumeClaim:
            claimName: c1-volume
status: {}
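Note that SERVER_ADDRESS points at the Service name (tcp://servidor), so the client's connect string becomes tcp://servidor:7777 and relies entirely on cluster DNS resolving servidor, which is what turns out to matter in the end.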
client-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: ./kompose convert
    kompose.version: 1.22.0 (955b78124)
  creationTimestamp: null
  labels:
    io.kompose.service: c1
  name: c1
spec:
  ports:
    - name: "9999"
      port: 9999
      targetPort: 9999
    - name: "8888"
      port: 8888
      targetPort: 8888
  selector:
    app: c1
  type: ClusterIP
server-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert -f docker-compose-resolved.yml
    kompose.version: 1.19.0 (f63a961c)
  creationTimestamp: null
  labels:
    io.kompose.service: servidor
  name: servidor
  namespace: fishtrace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: servidor
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        kompose.cmd: kompose convert -f docker-compose-resolved.yml
        kompose.version: 1.19.0 (f63a961c)
      creationTimestamp: null
      labels:
        app: servidor
    spec:
      containers:
        - env:
            - name: SERVER_ADDRESS
              value: tcp://*
            - name: SERVER_PUERTO
              value: "7777"
          image: registrogeminis.com/docker_servidor_rpi:latest
          name: servidor
          ports:
            - containerPort: 7777
            - containerPort: 7778
          resources: {}
          volumeMounts:
            - mountPath: /app/vol
              name: servidor-volume
          imagePullPolicy: Always
      restartPolicy: Always
      imagePullSecrets:
        - name: myregistrykey
      volumes:
        - name: servidor-volume
          persistentVolumeClaim:
            claimName: servidor-volume
status: {}
server-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: ./kompose convert
    kompose.version: 1.22.0 (955b78124)
  creationTimestamp: null
  labels:
    io.kompose.service: servidor
  name: servidor
spec:
  ports:
    - name: "7777"
      port: 7777
      targetPort: 7777
  selector:
    app: servidor
  type: ClusterIP
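For anyone debugging a similar setup, two quick checks (my suggestion, assuming kubectl access; busybox:1.28 is a common image for DNS tests) would have pointed at the culprit sooner:

# Does the Service actually select the server pod?
kubectl -n fishtrace get endpoints servidor

# Does the Service name resolve from inside the cluster?
kubectl -n fishtrace run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup servidor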
In the end it was a DNS problem. It is always a DNS problem. Thanks to @Matt for pointing out the issue.
In the official Kubernetes DNS doc they state that there is a known issue with systems that use /etc/resolv.conf as a link to the real configuration file, /run/systemd/resolve/resolv.conf in my case. It is a well-known problem, and the recommended solution is to update the kubelet configuration to point to /run/systemd/resolve/resolv.conf.
To do that, I added the line resolvConf: /run/systemd/resolve/resolv.conf to /var/lib/kubelet/config.yaml. To be safe, I also edited /etc/kubernetes/kubelet.conf. Finally, you should reload the service with sudo systemctl daemon-reload && sudo systemctl restart kubelet to propagate the change.
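For clarity, the relevant part of the kubelet config ends up looking like this (only the resolvConf line is the addition; the other fields are whatever the file already contains):

# /var/lib/kubelet/config.yaml (excerpt)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
resolvConf: /run/systemd/resolve/resolv.conf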
However, I had already done all of this before asking on SE, and it did not seem to work. I had to restart the whole cluster for the change to take effect. After that, DNS worked perfectly and the ZMQ sockets behaved as expected.
Update 31/04/2021: I found out that you have to force a restart of the coredns deployment for the change to actually propagate. So in the end, running kubectl rollout restart deployment coredns -n kube-system after restarting the kubelet service does the trick.
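Putting it all together, the sequence that finally worked for me:

# on each node, after adding resolvConf to the kubelet config
sudo systemctl daemon-reload && sudo systemctl restart kubelet

# then force CoreDNS to pick up the change
kubectl rollout restart deployment coredns -n kube-system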