How do you deploy aws-xray-daemon in Docker swarm?

I am trying to deploy amazon/aws-xray-daemon to my Docker swarm.

I didn't do much in the way of configuration, since I don't see much to configure in the README.md:

services:
  xrayd:
    image: amazon/aws-xray-daemon
    deploy:
      restart_policy:
        delay: 2m

I get the following in the logs:

2021-02-27T04:50:38Z [Info] Initializing AWS X-Ray daemon 3.2.0
2021-02-27T04:50:38Z [Info] Using buffer memory limit of 78 MB
2021-02-27T04:50:38Z [Info] 1248 segment buffers allocated
2021-02-27T04:50:39Z [Error] Unable to retrieve the region from the EC2 instance EC2MetadataRequestError: failed to get EC2 instance identity document
caused by: RequestError: send request failed
caused by: Get http://169.254.169.254/latest/dynamic/instance-identity/document: dial tcp 169.254.169.254:80: connect: network is unreachable
2021-02-27T04:50:39Z [Error] Cannot fetch region variable from config file, environment variables and ec2 metadata.

I have also granted the EC2 instance full xray:* permissions in IAM.

My understanding is that the X-Ray daemon is unable to fetch the EC2 metadata - https://github.com/aws/aws-xray-daemon/blob/7494caf05b6f5c1e8c9a59ebefc64b8f822983cd/pkg/conn/conn.go#L152. My recommendation would be to set the region explicitly using the AWS_REGION environment variable. Also, as a debugging step, I would recommend checking whether you can retrieve the EC2 metadata manually. You can follow this post (Find region from within an EC2 instance) to check from your EC2 instance.
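
For illustration, here is a minimal sketch of the AWS_REGION suggestion applied to the stack file from the question. The region value us-east-1 is only a placeholder and an assumption on my part; substitute whichever region your traces should be sent to.

```yaml
services:
  xrayd:
    image: amazon/aws-xray-daemon
    environment:
      # Placeholder region (assumption); replace with your own region.
      # Setting it explicitly means the daemon should not need to fall back
      # to the EC2 metadata lookup that fails in the log above.
      - AWS_REGION=us-east-1
    deploy:
      restart_policy:
        delay: 2m
```

This only removes the dependency on the metadata endpoint for region discovery. If the container still cannot reach the network at all, the daemon will presumably still fail to send segments, so the manual metadata check remains a useful debugging step.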

Sure enough, I found the problem. docker stack deploy does not update the network.*.internal setting. So even though I changed it to network.*.internal: false, the change was never picked up.

I had to remove and redeploy the stack to get it working.

services:
  xrayd:
    image: amazon/aws-xray-daemon
    # command: --log-level warn
    command: --log-level error
    networks:
      - xray
    logging:
      driver: none
    deploy:
      restart_policy:
        delay: 2m
        max_attempts: 2
networks:
  xray:
    internal: false
    attachable: false