How do you deploy aws-xray-daemon in Docker swarm?

Question

I am trying to deploy amazon/aws-xray-daemon to my Docker swarm.

I haven't done much in the way of configuration, because I didn't see many configuration options in the README.md.

services:
  xrayd:
    image: amazon/aws-xray-daemon
    deploy:
      restart_policy:
        delay: 2m

I get the following in the logs:

2021-02-27T04:50:38Z [Info] Initializing AWS X-Ray daemon 3.2.0
2021-02-27T04:50:38Z [Info] Using buffer memory limit of 78 MB
2021-02-27T04:50:38Z [Info] 1248 segment buffers allocated
2021-02-27T04:50:39Z [Error] Unable to retrieve the region from the EC2 instance EC2MetadataRequestError: failed to get EC2 instance identity document
caused by: RequestError: send request failed
caused by: Get http://169.254.169.254/latest/dynamic/instance-identity/document: dial tcp 169.254.169.254:80: connect: network is unreachable
2021-02-27T04:50:39Z [Error] Cannot fetch region variable from config file, environment variables and ec2 metadata.

I have also granted the EC2 instance full xray:* permissions in IAM.

Answer 1

My understanding is that the X-Ray daemon is unable to retrieve the EC2 metadata (see https://github.com/aws/aws-xray-daemon/blob/7494caf05b6f5c1e8c9a59ebefc64b8f822983cd/pkg/conn/conn.go#L152). My recommendation would be to set the region explicitly using the AWS_REGION environment variable. Also, as a debugging step, I would recommend checking whether you can fetch the EC2 metadata manually. You can follow this post (Find region from within an EC2 instance) to check your EC2 instance.
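As a minimal sketch of that recommendation, the region can be passed to the daemon through the service's environment in the stack file. This reuses the service definition from the question; the region value us-east-1 is only a placeholder and should be replaced with the region your traces should be sent to.

```yaml
services:
  xrayd:
    image: amazon/aws-xray-daemon
    environment:
      # Placeholder region (assumption): use the region of your X-Ray account.
      - AWS_REGION=us-east-1
    deploy:
      restart_policy:
        delay: 2m
```

With the region set explicitly, the daemon should no longer need the EC2 metadata lookup to determine it, although it still needs outbound network access to reach the X-Ray service.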

Answer 2

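I found the problem. docker stack deploy does not update the networks.*.internal setting on an existing network. So even though I changed it to networks.*.internal: false, the change was never picked up. An internal network has no external connectivity, which matches the "network is unreachable" error in the logs.

I had to remove the stack and redeploy it to get this working.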

services:
  xrayd:
    image: amazon/aws-xray-daemon
    # command: --log-level warn
    command: --log-level error
    networks:
      - xray
    logging:
      driver: none
    deploy:
      restart_policy:
        delay: 2m
        max_attempts: 2
networks:
  xray:
    internal: false
    attachable: false


amazon-ec2

amazon-web-services

docker

docker-swarm

aws-xray