AWS ELB is extremely slow and patchy

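I have set up an Internet-facing ELB to reach the Apache Airflow webserver, which runs on port 8080 of the instance.

Configuration

  1. Single-AZ ELB
  2. Auto Scaling group with a single m4.large instance

Below is the Terraform resource for the ELB: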

resource "aws_elb" "airflow_elb" {
  name = "${var.domain_name}-elb"
  subnets = ["${aws_subnet.private.id}"]

  security_groups = ["${aws_security_group.public.id}"]

  listener {
    instance_port = 8080
    instance_protocol = "http"
    lb_port = 80
    lb_protocol = "http"
  }

  health_check {
    healthy_threshold = "${var.elb_healthy_threshold}"
    interval = "${var.elb_interval}"
    target = "HTTP:8080/admin/"
    timeout = "${var.elb_timeout}"
    unhealthy_threshold = "${var.elb_unhealthy_threshold}"
  }

  access_logs {
    bucket = "${aws_s3_bucket.bucket.bucket}"
    bucket_prefix = "elb-logs"
    interval = 60
  }

  cross_zone_load_balancing = false
  idle_timeout = 400
  connection_draining = true
  connection_draining_timeout = 400

  tags {
    Name = "airflow-elb"
  }

}

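I can reach the instance's private IP through an SSH tunnel via the bastion host, and the portal works fine. However, when I go through the ELB's DNS name, one of two things happens: either it is extremely slow (I can see the webserver answer the request almost immediately, yet the page takes forever to load), or the ELB throws an HTTP 503.

Please help!!

EDIT 1: The backend processing time is very high, but I can see that this only happens when accessing through the ELB; over the tunnel it behaves normally.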
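Since the resource above enables access_logs, one way to check the observation in EDIT 1 is to read the timing fields from those logs. The following is a minimal Python sketch, assuming the documented Classic ELB access log layout (space-separated fields with the request and user agent in double quotes). The names `FIELDS`, `parse_line` and `slow_backend_entries`, and the 1-second threshold, are illustrative and not from the original post.

```python
import shlex

# Field order of a Classic ELB access log entry.
FIELDS = (
    "timestamp", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
)
TIME_FIELDS = (
    "request_processing_time",
    "backend_processing_time",
    "response_processing_time",
)


def parse_line(line):
    """Split one log line into a dict; the three timing fields become floats."""
    entry = dict(zip(FIELDS, shlex.split(line)))
    for key in TIME_FIELDS:
        # A value of -1 means the ELB could not dispatch the request.
        entry[key] = float(entry[key])
    return entry


def slow_backend_entries(lines, threshold_seconds=1.0):
    """Return the entries whose backend took at least threshold_seconds."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e["backend_processing_time"] >= threshold_seconds]
```

Comparing backend_processing_time with the response time the webserver itself logs for the same request would show whether the delay is spent waiting for a worker rather than in the application.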

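Assuming you are using a Classic ELB, the AWS Documentation states three causes for an HTTP 503:

Cause 1: Insufficient capacity in the load balancer to handle the request.

Cause 2: There are no registered instances.

Cause 3: There are no healthy instances.

Log in to the console and check whether the instance is registered with the ELB and, if so, whether it is in a healthy state.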
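The same check can be scripted instead of done in the console. This is a minimal Python sketch, assuming boto3 is installed and AWS credentials are configured; it uses the Classic Load Balancer client's `describe_instance_health` call. The function names `diagnose_503` and `fetch_instance_states` are illustrative only.

```python
def diagnose_503(instance_states):
    """Map DescribeInstanceHealth output onto the causes listed above."""
    if not instance_states:
        return "cause 2: no registered instances"
    healthy = [s["InstanceId"] for s in instance_states if s["State"] == "InService"]
    if not healthy:
        return "cause 3: no healthy instances"
    return "instances are InService; look at capacity (cause 1) or the backend"


def fetch_instance_states(load_balancer_name):
    """Ask AWS for the health of every instance registered with the ELB."""
    # boto3 is third-party; it is imported here so diagnose_503 works without it.
    import boto3

    client = boto3.client("elb")  # "elb" is the Classic Load Balancer API
    response = client.describe_instance_health(LoadBalancerName=load_balancer_name)
    return response["InstanceStates"]
```

Calling `diagnose_503(fetch_instance_states(name))` with the name of your load balancer prints which of the three causes, if any, applies.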

I'm also curious why you are using only one AZ?

Some useful resources when diagnosing ELB issues:

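The problem was actually the use of sync workers with Python 3, combined with how the ELB reuses HTTP connections. After switching from the sync worker to gevent, the problem went away. However, Python 3 does not support gevent as of now, so we are using Python 2.7 for the time being.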
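For context, the Airflow webserver runs under Gunicorn, and "sync" and "gevent" are Gunicorn worker classes. One plausible mechanism behind the symptom, which is an assumption and not something the answer confirms, is that the load balancer holds open connections that carry no request yet, and a worker that handles one connection at a time blocks on them. The sketch below simulates that with plain sockets only; `serve_sync`, `timed_request`, `demo` and the 1-second `READ_TIMEOUT` are all hypothetical.

```python
import socket
import threading
import time

READ_TIMEOUT = 1.0  # seconds the worker waits for request bytes (made-up value)


def serve_sync(listener, max_conns):
    """Handle connections strictly one at a time, like a single sync worker."""
    for _ in range(max_conns):
        conn, _addr = listener.accept()
        with conn:
            conn.settimeout(READ_TIMEOUT)
            try:
                data = conn.recv(1024)
            except socket.timeout:
                continue  # idle connection: nothing arrived, give up on it
            if data:
                conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")


def timed_request(port):
    """Send one request and return how long the response took, in seconds."""
    start = time.monotonic()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        client.recv(1024)
    return time.monotonic() - start


def demo():
    """Open a silent connection first, then time a real request behind it."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]
    threading.Thread(target=serve_sync, args=(listener, 2), daemon=True).start()

    idle = socket.create_connection(("127.0.0.1", port))  # connected but silent
    time.sleep(0.1)  # let the worker pick up the idle connection first
    elapsed = timed_request(port)
    idle.close()
    listener.close()
    return elapsed
```

In this simulation the real request is answered only after the worker gives up on the idle connection, so `demo()` returns a latency close to `READ_TIMEOUT` even though the handler itself is instant. An asynchronous worker such as gevent can serve other connections while one sits idle.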

You can try this answer:

Solution: if your DNS is configured to point directly at the ELB, you should reduce the TTL of the (IP, DNS) association. The ELB's IP addresses can change at any time, so stale records can seriously disrupt your traffic.

Clients keep some of the ELB's IP addresses in their cache, which is how you run into this kind of trouble.

Scaling Elastic Load Balancers

Once you create an elastic load balancer, you must configure it to accept incoming traffic and route requests to your EC2 instances. These configuration parameters are stored by the controller, and the controller ensures that all of the load balancers are operating with the correct configuration. The controller will also monitor the load balancers and manage the capacity that is used to handle the client requests. It increases capacity by utilizing either larger resources (resources with higher performance characteristics) or more individual resources. The Elastic Load Balancing service will update the Domain Name System (DNS) record of the load balancer when it scales so that the new resources have their respective IP addresses registered in DNS. The DNS record that is created includes a Time-to-Live (TTL) setting of 60 seconds, with the expectation that clients will re-lookup the DNS at least every 60 seconds. By default, Elastic Load Balancing will return multiple IP addresses when clients perform a DNS resolution, with the records being randomly ordered on each DNS resolution request. As the traffic profile changes, the controller service will scale the load balancers to handle more requests, scaling equally in all Availability Zones.

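In my case, the problem was the TTL. You can trace it with a command such as wget https://your-url. The command output shows the IP address it is trying to connect to. When the connection hangs, you can spot the stale IP address that is causing the failure. If this happens, check your DNS settings and update the TTL.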
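The same trace can be sketched in Python using only the standard library: resolve the name, then try a TCP connection to each address with a short timeout. The function names, the default port 443 and the 3-second timeout are illustrative assumptions, not part of the answer above.

```python
import socket


def resolve_ipv4(hostname, port=443):
    """Return the IPv4 addresses DNS currently gives for hostname, sorted."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def find_unreachable(ips, port=443, timeout=3.0, connect=socket.create_connection):
    """Try a TCP connect to each address; return those that fail or time out."""
    bad = []
    for ip in ips:
        try:
            connect((ip, port), timeout=timeout).close()
        except OSError:  # covers timeouts and refused connections
            bad.append(ip)
    return bad
```

If the address wget hangs on is missing from `resolve_ipv4("your-url")`, or shows up in `find_unreachable(...)`, the client or an intermediate resolver is likely holding a stale record.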