Max retries exceeded with url (Caused by NewConnectionError: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution)

I am using the Python requests library to call an API.

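My Python script is run by a scheduler once a day. As soon as the script runs, I get this error, and then the script is killed with an OOM message. I can't tell whether this is a DNS problem or an OOM (out of memory) problem, because the process is being killed.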

The script used to run without any problems.

Any clues or help would be greatly appreciated.

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connection.py", line 170, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/util/connection.py", line 73, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connection.py", line 353, in connect
    conn = self._new_conn()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connection.py", line 182, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f163156c160>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='<.............>', port=443): Max retries exceeded with url: /api/v2/test_connection/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f163156c160>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/pf_cic_etl_script_v2.py", line 50, in <module>
    resp = requests.get(url, headers=headers)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='<...............>', port=443): Max retries exceeded with url: /api/v2/test_connection/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f163156c160>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
Killed

Process kill log:

dmesg -T| grep -E -i -B100 'killed process'
[Sat Sep 25 06:08:31 2021] Tasks state (memory values in pages):
[Sat Sep 25 06:08:31 2021] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[Sat Sep 25 06:08:31 2021] [   1643]  1000  1643   517804   455512  4157440        0             0 python3
[Sat Sep 25 06:08:31 2021] [   1651]     0  1651     5954       69    86016        0             0 apport
[Sat Sep 25 06:08:31 2021] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice,task=python3,pid=1643,uid=1000
[Sat Sep 25 06:08:31 2021] Out of memory: Killed process 1643 (python3) total-vm:2071216kB, anon-rss:1822048kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:4060kB oom_score_adj:0
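To read the numbers in this log, a small sketch like the one below may help. It parses the "Killed process" line shown above and cross-checks it against the task table, which reports rss in pages. The function name parse_oom_line and the 4 kB page size are my assumptions (4 kB is the usual page size on x86-64 Linux), not something the log itself states.

```python
import re

# The "Killed process" line copied from the dmesg output above.
OOM_LINE = (
    "[Sat Sep 25 06:08:31 2021] Out of memory: Killed process 1643 (python3) "
    "total-vm:2071216kB, anon-rss:1822048kB, file-rss:0kB, shmem-rss:0kB, "
    "UID:1000 pgtables:4060kB oom_score_adj:0"
)

# Matches the pid, process name and the two memory figures we care about.
OOM_PATTERN = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB"
)

PAGE_SIZE_KB = 4  # assumption: 4 kB pages, typical on x86-64 Linux


def parse_oom_line(line):
    """Return pid, name and memory figures in MiB, or None if the line does not match."""
    match = OOM_PATTERN.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        "total_vm_mib": int(match.group("total_vm_kb")) / 1024,
        "anon_rss_mib": int(match.group("anon_rss_kb")) / 1024,
    }


info = parse_oom_line(OOM_LINE)
print(info)

# The task table lists rss=455512 pages for pid 1643; in kB this should
# equal the anon-rss value on the "Killed process" line.
print(455512 * PAGE_SIZE_KB)
```

In other words, python3 was holding roughly 1.7 GiB of resident memory when the kernel killed it.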

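I found the problem. In my case it was not a DNS issue. The problem was OOM (out of memory) on the EC2 instance, which was killing the Python script's process. As a result the "instance reachability check failed", and I got "Failed to establish a new connection: [Errno -3] Temporary failure in name resolution".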

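After upgrading the EC2 instance, the instance reachability check no longer failed and I was able to run the Python script that makes the API calls.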

https://aws.amazon.com/premiumsupport/knowledge-center/system-reachability-check/

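An instance status check failure indicates a problem with the instance's reachability. This problem occurs due to operating-system-level errors such as: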

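- Failure to boot the operating system
- Failure to mount the volumes correctly
- Exhausted CPU and memory (this is what happened in our case)
- Kernel panic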
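If you are unsure whether you are looking at a real DNS failure or at memory exhaustion, a rough sketch like the following could be called before and after the API request. It uses only the standard library. The helper names can_resolve and peak_rss_mib and the host api.example.com are placeholders of mine, not taken from the original script.

```python
import resource
import socket


def can_resolve(host, port=443):
    """Return True if the host name resolves, False on a name-resolution error."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Same exception class as the one at the top of the traceback above.
        return False


def peak_rss_mib():
    """Peak resident memory of this process in MiB.

    ru_maxrss is reported in kilobytes on Linux (macOS reports bytes),
    so this conversion assumes a Linux host such as the Ubuntu instance here.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


if __name__ == "__main__":
    host = "api.example.com"  # placeholder; substitute the real API host
    print("DNS ok:", can_resolve(host))
    print("Peak RSS so far: %.1f MiB" % peak_rss_mib())
```

If the name resolves fine from a fresh shell but the peak RSS reported by the script keeps climbing toward the instance's total memory, that points at memory exhaustion rather than DNS.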

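I got the same error on my Hostinger 20 GB VPS when I tried to run my letsencrypt script. The problem was directly related to the disk running out of space. I ran the command below to clean up unused Docker data (stopped containers, unused images and networks, and build cache), which reclaimed 17.4 GB. After that the problem was resolved: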

sudo docker system prune --all
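To catch this earlier next time, a script could check free disk space before it starts. Below is a minimal sketch using the standard library; the helper names and the 1 GiB threshold are arbitrary choices of mine for illustration.

```python
import shutil


def free_space_gib(path="/"):
    """Free space on the filesystem containing `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024 ** 3


def has_enough_space(path="/", min_free_gib=1.0):
    """True if at least `min_free_gib` GiB are free (the threshold is an arbitrary example)."""
    return free_space_gib(path) >= min_free_gib


if __name__ == "__main__":
    print("Free on /: %.1f GiB" % free_space_gib("/"))
    if not has_enough_space("/"):
        print("Low disk space; consider cleaning up before running the job.")
```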