Python: Foursquare API and Requests requires cookies and javascript
Question
I'm trying to call the Foursquare API, specifically the checkin/resolve endpoint. This used to work, but recently I've been blocked by an error message saying that I'm a bot and that my client cannot read cookies and JavaScript.
Code
response = "Swarmapp URL" # from previous functions, this isn't the problem
checkin_id = response.split("c/")[1] # To get shortID
url = "https://api.foursquare.com/v2/checkins/resolve"
params = dict(
client_id = "client_id",
client_secret = "client_secret",
shortId = checkin_id,
v = "20180323")
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
time.sleep(8.5) # Limit of 500 requests an hour
resp = requests.get(url = url, params=params, headers = headers)
data = json.loads(resp.text)
This code works for roughly 30-40 requests, after which it errors out and returns an HTML file containing messages such as "Please verify you are human", "Access to this page has been denied because we believe you are using automation tools to browse the website." and "Your browser does not support cookies", among others.
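For what it's worth, one way to see whether a given response is the JSON you asked for or that bot-check page is to inspect it before parsing. This is only a minimal sketch, assuming the API's normal responses are served with a JSON Content-Type (the credentials and shortId below are placeholders):

import requests

url = "https://api.foursquare.com/v2/checkins/resolve"
params = dict(client_id="client_id", client_secret="client_secret",
              shortId="swarmPostID", v="20180323")

resp = requests.get(url, params=params)
content_type = resp.headers.get("Content-Type", "")

if "application/json" in content_type:
    data = resp.json()  # normal API response
else:
    # The block page described above comes back as HTML, so json.loads would fail on it.
    print("Got a non-JSON response (likely the bot-check page):")
    print(resp.text[:200])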
I've tried Googling and searching this site for similar errors, but couldn't find anything useful. The Foursquare API documentation doesn't mention this either.
Any suggestions?
Answer
According to the Foursquare API documentation, this code should work:
import json, requests

url = 'https://api.foursquare.com/v2/checkins/resolve'
params = dict(
    client_id='CLIENT_ID',
    client_secret='CLIENT_SECRET',
    v='20180323',
    shortId='swarmPostID'
)
resp = requests.get(url=url, params=params)
data = json.loads(resp.text)
However, the bot detection Foursquare uses apparently conflicts with the API's own functionality. I found that implementing a try/except with a wait timer solved the problem:
import json, requests, time

url = 'https://api.foursquare.com/v2/checkins/resolve'
params = dict(
    client_id='CLIENT_ID',
    client_secret='CLIENT_SECRET',
    v='20180323',
    shortId='swarmPostID'
)
try:
    resp = requests.get(url=url, params=params)
except:
    time.sleep(60)  # Wait out the bot detection before retrying
    try:
        resp = requests.get(url=url, params=params)
    except:
        print("Post is private or deleted.")
        continue  # Assumes this block sits inside a loop over check-ins
data = json.loads(resp.text)
This seems like a very strange fix. Either Foursquare has implemented a DDoS-prevention system that conflicts with its own functionality, or its checkin/resolve endpoint is broken. Either way, the code works.
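As a side note, since the block page described in the question comes back as an HTML body rather than a raised exception, the same wait-timer idea can also be driven by checking whether the body parses as JSON. This is only a sketch under that assumption; the retry count and the 60-second pause are arbitrary choices, not values from the Foursquare documentation:

import json, time, requests

def resolve_checkin(short_id, client_id, client_secret, retries=3):
    """Call checkins/resolve, pausing and retrying if a non-JSON (bot-check) page comes back."""
    url = 'https://api.foursquare.com/v2/checkins/resolve'
    params = dict(client_id=client_id, client_secret=client_secret,
                  v='20180323', shortId=short_id)
    for attempt in range(retries):
        resp = requests.get(url, params=params)
        try:
            return json.loads(resp.text)  # normal JSON response
        except ValueError:
            # Body was not JSON -- most likely the "verify you are human" page.
            time.sleep(60)  # wait out the bot detection, then retry
    return None  # still blocked after all retries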