AWS:ALB 立即响应或在 130 秒后响应
AWS: ALB responding either immediately or after 130 seconds
我正在使用 ALB 并尝试利用基于主机的路由概念在它后面放置多个主机(与目标组进行 1-1 映射)。
所以我有 5 个网址,每个网址都被转发到不同的 TG(因此在我的例子中就是这样),例如
https://path1.mydomain.com
https://path2.mydomain.com
(...等等)...
我注意到我得到了二进制行为,即 ALB 将几乎立即响应(即 < 1.sec)或在大约 130 秒内响应。
$ for X in `seq 60`; do curl -Ik -w "HTTPCode=%{http_code} TotalTime=%{time_total}\n" https://path1.mydomain.com -so /dev/null; done
HTTPCode=200 TotalTime=130.157
HTTPCode=200 TotalTime=131.053
HTTPCode=200 TotalTime=131.050
HTTPCode=200 TotalTime=0.485
HTTPCode=200 TotalTime=130.533
HTTPCode=200 TotalTime=0.467
HTTPCode=200 TotalTime=130.586
HTTPCode=200 TotalTime=0.477
HTTPCode=200 TotalTime=130.567
这适用于所有路径。
知道这种行为可能来自哪里吗?
这是响应头(我总是得到200
,不管延迟如何)
$ curl -kI https://path1.mydomain.com
HTTP/1.1 200 OK
Date: Wed, 05 Dec 2018 17:03:14 GMT
Content-Type: text/html
Content-Length: 1617
Connection: keep-alive
Server: nginx
Last-Modified: Thu, 19 Jul 2018 09:52:09 GMT
ETag: "5b505f49-651"
Cache-Control: no-cache
Accept-Ranges: bytes
edit_1:虽然我为 ALB 注册了 2 subnets/AZs,但我所有的实例都在同一个 AZ/subnet 中,因为重要的是什么;
edit_2:当直接访问 public 个实例的 IP 之一时:
$ for X in `seq 60`; do curl -Ik -w "HTTPCode=%{http_code} TotalTime=%{time_total}\n" http://18.9.48.141 -so /dev/null; done
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.007
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.007
HTTPCode=200 TotalTime=0.007
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.010
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.008
edit_3:不是 dns 解析问题,因为 dig
的 100 次迭代命令 returns 立即没有错误。
edit_4:这是curl
命令的strace
挂起的地方:
connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("51.53.132.130")}, 16) = -1 EINPROGRESS (Operation now in progress)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 199) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
edit_5:对 public IP 的一些 tcptraceroute
迭代 curl
命令挂起
$ for i in `seq 10`; do sudo tcptraceroute 51.53.132.130; done
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.109 ms 2.097 ms 2.230 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 1.964 ms 1.954 ms 1.942 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.148 ms 2.220 ms 2.208 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.227 ms 2.214 ms 2.200 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.181 ms 2.170 ms 2.159 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.157 ms 2.221 ms 2.207 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.228 ms 2.216 ms 2.203 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 1.810 ms 1.962 ms 1.961 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 1.695 ms 1.757 ms 1.852 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.202 ms 2.187 ms 2.173 ms
很明显,巨大的延迟是由应用程序负载均衡器引入的。
问题如下:
我已经为我的 ALB 分配了 2 个 AZ,而与其中之一对应的子网没有 0.0.0.0/0 --> IG
路由。
我正在使用 ALB 并尝试利用基于主机的路由概念在它后面放置多个主机(与目标组进行 1-1 映射)。
所以我有 5 个网址,每个网址都被转发到不同的 TG(因此在我的例子中就是这样),例如
https://path1.mydomain.com
https://path2.mydomain.com
(...等等)...
我注意到我得到了二进制行为,即 ALB 将几乎立即响应(即 < 1.sec)或在大约 130 秒内响应。
$ for X in `seq 60`; do curl -Ik -w "HTTPCode=%{http_code} TotalTime=%{time_total}\n" https://path1.mydomain.com -so /dev/null; done
HTTPCode=200 TotalTime=130.157
HTTPCode=200 TotalTime=131.053
HTTPCode=200 TotalTime=131.050
HTTPCode=200 TotalTime=0.485
HTTPCode=200 TotalTime=130.533
HTTPCode=200 TotalTime=0.467
HTTPCode=200 TotalTime=130.586
HTTPCode=200 TotalTime=0.477
HTTPCode=200 TotalTime=130.567
这适用于所有路径。
知道这种行为可能来自哪里吗?
这是响应头(我总是得到200
,不管延迟如何)
$ curl -kI https://path1.mydomain.com
HTTP/1.1 200 OK
Date: Wed, 05 Dec 2018 17:03:14 GMT
Content-Type: text/html
Content-Length: 1617
Connection: keep-alive
Server: nginx
Last-Modified: Thu, 19 Jul 2018 09:52:09 GMT
ETag: "5b505f49-651"
Cache-Control: no-cache
Accept-Ranges: bytes
edit_1:虽然我为 ALB 注册了 2 subnets/AZs,但我所有的实例都在同一个 AZ/subnet 中,因为重要的是什么;
edit_2:当直接访问 public 个实例的 IP 之一时:
$ for X in `seq 60`; do curl -Ik -w "HTTPCode=%{http_code} TotalTime=%{time_total}\n" http://18.9.48.141 -so /dev/null; done
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.007
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.007
HTTPCode=200 TotalTime=0.007
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.010
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.005
HTTPCode=200 TotalTime=0.008
edit_3:不是 dns 解析问题,因为 dig
的 100 次迭代命令 returns 立即没有错误。
edit_4:这是curl
命令的strace
挂起的地方:
connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("51.53.132.130")}, 16) = -1 EINPROGRESS (Operation now in progress)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 199) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout)
edit_5:对 public IP 的一些 tcptraceroute
迭代 curl
命令挂起
$ for i in `seq 10`; do sudo tcptraceroute 51.53.132.130; done
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.109 ms 2.097 ms 2.230 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 1.964 ms 1.954 ms 1.942 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.148 ms 2.220 ms 2.208 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.227 ms 2.214 ms 2.200 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.181 ms 2.170 ms 2.159 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.157 ms 2.221 ms 2.207 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.228 ms 2.216 ms 2.203 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 1.810 ms 1.962 ms 1.961 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 1.695 ms 1.757 ms 1.852 ms
traceroute to 51.53.132.130 (51.53.132.130), 30 hops max, 60 byte packets
1 * * *
2 51.53.132.130 (51.53.132.130) <syn,ack> 2.202 ms 2.187 ms 2.173 ms
很明显,巨大的延迟是由应用程序负载均衡器引入的。
问题如下:
我已经为我的 ALB 分配了 2 个 AZ,而与其中之一对应的子网没有 0.0.0.0/0 --> IG
路由。