Trying to test whether some URL addresses work with Python requests, but getting errors
I am trying to learn how to test some internet addresses with Python requests, and I expect output such as 200 or 404. But I am getting an error that I cannot figure out. I am also open to any suggestions for my purpose.
import os, sys, requests
from multiprocessing import Pool


def url_check(url):
    resp = requests.get(url)
    print(resp.status_code)

with Pool(4) as p:
    print(p.map(url_check, [ "https://api.github.com", "http://bilgisayar.mu.edu.tr/", "https://www.python.org/", "http://akrepnalan.com/ceng2034", "https://github.com/caesarsalad/wow" ]))
Output with the error:
404
404
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "ödev_deneme.py", line 6, in url_check
    resp = requests.get(url)
  File "/home/efe/.local/lib/python3.6/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/home/efe/.local/lib/python3.6/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/efe/.local/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/efe/.local/lib/python3.6/site-packages/requests/sessions.py", line 637, in send
    adapter = self.get_adapter(url=request.url)
  File "/home/efe/.local/lib/python3.6/site-packages/requests/sessions.py", line 728, in get_adapter
    raise InvalidSchema("No connection adapters were found for {!r}".format(url))
requests.exceptions.InvalidSchema: No connection adapters were found for '\u200bhttps://www.python.org/\u200b'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "ödev_deneme.py", line 10, in <module>
    print(p.map(url_check, [ "https://api.github.com", "http://bilgisayar.mu.edu.tr/", "https://www.python.org/", "http://akrepnalan.com/ceng2034", "https://github.com/caesarsalad/wow" ]))
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
requests.exceptions.InvalidSchema: No connection adapters were found for '\u200bhttps://www.python.org/\u200b'
The output I expect is:
200
200
200
404
200
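The fourth line is 404 because the fourth URL address is invalid. But in my output the first two lines are already 404. I guess there is a huge mistake in my code.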
The problem is that some of the URLs contain invisible zero-width space characters ('\u200b').
You can replace them with an empty string:
def url_check(url):
    resp = requests.get(url.replace('\u200b', ''))
    print(resp.status_code)
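If the URLs were copied from a web page, other invisible characters may be hiding in them besides '\u200b'. Here is a minimal sketch of a slightly more general cleanup. The names `INVISIBLE`, `clean_url` and `has_invisible` are made up for this example, and the set of characters is an illustrative choice, not an exhaustive list:

```python
# Invisible characters that often sneak in when text is copied from web pages:
# zero-width space, zero-width non-joiner, zero-width joiner, word joiner, BOM.
# Mapping each code point to None tells str.translate() to delete it.
INVISIBLE = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def clean_url(url):
    """Return url without the characters above and without surrounding whitespace."""
    return url.translate(INVISIBLE).strip()


def has_invisible(url):
    """Return True if url contains any of the characters listed in INVISIBLE."""
    return any(ord(ch) in INVISIBLE for ch in url)


if __name__ == "__main__":
    urls = ["https://api.github.com", "\u200bhttps://www.python.org/\u200b"]
    for url in urls:
        # ascii() shows hidden characters as escape sequences such as \u200b
        print(ascii(url), has_invisible(url), clean_url(url))
```

In `url_check` you would then call `requests.get(clean_url(url))`. Printing `ascii(url)` is also a quick way to check which entries in the list actually carry hidden characters.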