Why can I access a site through the browser, but get a 403 error when I fetch it with simple code?
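Endpoint: https://quizlet.com/webapi/3.2/images/search?query=hello&perPage=2
You can try opening this page in Incognito mode; it works on my side. So I assumed I could fetch data from this site.
I tried replicating the request in JavaScript and in Python and running it. However, it doesn't work: I get a 403 error.
I also tried Burp Suite. I can't access this site through Burp's browser.
Also, since it works in Incognito, I don't think it is related to cookies.
Code example (JS):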
import fetch from "node-fetch";
const response = await fetch(
  "https://quizlet.com/webapi/3.2/images/search?query=hello&perPage=2",
  {
    headers: {
      accept:
        "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
      "accept-language": "en",
      "cache-control": "no-cache",
      pragma: "no-cache",
      "sec-ch-ua":
        '"Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93"',
      "sec-ch-ua-mobile": "?0",
      "sec-ch-ua-platform": '"Linux"',
      "sec-fetch-dest": "document",
      "sec-fetch-mode": "navigate",
      "sec-fetch-site": "none",
      "sec-fetch-user": "?1",
      "upgrade-insecure-requests": "1",
    },
    referrerPolicy: "strict-origin-when-cross-origin",
    body: null,
    method: "GET",
    mode: "cors",
    credentials: "include",
  }
);
// response.status is a plain number, so no await is needed here
const data = response.status;
console.log(data);
Code (Python):
import requests
headers = {
    'authority': 'quizlet.com',
    'pragma': 'no-cache',
    'cache-control': 'no-cache',
    'sec-ch-ua': '"Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'none',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-language': 'en',
    'cookie': 'qi5=i2x3g7y1z9a6%3At3vMoQQig2yLcpN.HKWn; qtkn=7gT4DE7pN9URJ2AFDYeaVe; fs=qzkse0; app_session_id=9781a407-4f37-4c09-8e97-8156f182bb45; search_session=%7B%22search_session_id%22%3A%22-2379864199063990974614477b859794%22%2C%22query%22%3A%22overrated%22%2C%22version%22%3A%221.1.1%22%2C%22platform%22%3A%22WEB%22%2C%22depth%22%3Anull%2C%22target_object_type%22%3A%22QImage%22%7D; __cf_bm=cB7hRf6JbcOFZ2kvQ3W12V4bxXiIgn_kF3n87RcI0h0-1631877048-0-Ac+Hi0pATLgW5N3JjqYa7uc5W4ZfDLOumvmCQixWJIKdcVj7stciFh8cYFVTOpr+q5pM2Q7LrXC/LsffOB6Mh2E=; __cfruid=81f16a673e6117331dd4270b3f4f29111590d7d8-1631877048',
}
params = (
    ('query', 'hello'),
    ('perPage', '2'),
)
response = requests.get(
    'https://quizlet.com/webapi/3.2/images/search', headers=headers, params=params)
# NB. Original query string below. It seems impossible to parse and
# reproduce query strings 100% accurately so the one below is given
# in case the reproduced version is not "correct".
# response = requests.get('https://quizlet.com/webapi/3.2/images/search?query=hello&perPage=2', headers=headers)
print(response.status_code)
Please help me. I don't even understand how this can happen (the browser works, but the code doesn't). Thanks in advance.
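On the query-string concern raised in the code comment above, a quick check with the standard library suggests the params tuple encodes to exactly the original URL, so the 403 is unlikely to come from the query string. This is only a sketch: it uses urllib.parse.urlencode, and it assumes requests encodes these simple ASCII parameters the same way. The names BASE_URL, rebuilt and original are introduced here for illustration.

```python
from urllib.parse import urlencode

# Endpoint and parameters copied from the question.
BASE_URL = "https://quizlet.com/webapi/3.2/images/search"
params = (
    ("query", "hello"),
    ("perPage", "2"),
)

# Rebuild the URL from the params tuple and compare it with the
# URL that was pasted into the browser.
rebuilt = f"{BASE_URL}?{urlencode(params)}"
original = "https://quizlet.com/webapi/3.2/images/search?query=hello&perPage=2"

print(rebuilt)
print(rebuilt == original)
```

If the two strings match, the difference between the browser and the script has to be somewhere else in the request.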
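From the Python side: I had a look out of interest, because I am currently developing a REST API and was curious how they protect theirs.
Looking at the traffic in Wireshark, the "requests" module in Python does not seem to handle HTTP requests the same way Chrome/Firefox do, and I suspect they are using that as a signal to serve a captcha.
Anyway, switch from requests to the httpx module;
pip install httpx
and change the headers to exactly replicate Firefox;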
import httpx
headers = [
    ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'),
    ('Accept-Encoding', 'gzip, deflate, br'),
    ('Accept-Language', 'en-GB,en;q=0.5'),
    ('Cache-Control', 'max-age=0'),
    ('Connection', 'keep-alive'),
    ('Host', 'quizlet.com'),
    ('Sec-Fetch-Dest', 'document'),
    ('Sec-Fetch-Mode', 'navigate'),
    ('Sec-Fetch-Site', 'none'),
    ('Sec-Fetch-User', '?1'),
    ('TE', 'trailers'),
    ('Upgrade-Insecure-Requests', '1'),
    ('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'),
]
params = (
    ('query', 'hello'),
    ('perPage', '2'),
)
response = httpx.get('https://quizlet.com/webapi/3.2/images/search', headers=headers, params=params)
print(response.content)
This gives me the following content instead of the captcha page;
{
  "responses": [{
    "models": {
      "image": [{
        "id": 18957872,
        "personId": 16641862,
        "timestamp": 1416579222,
        "lastModified": 1416579222,
        "code": "Gfg5XS88MRmYq8RS",
        "license": 1,
        "width": 480,
        "height": 360,
        "flickrId": null,
        "flickrOwner": null,
        "_legacyUrl": "http://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA.gif",
        "_legacyUrlSquare": "http://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_s.gif",
        "_legacyUrlSmall": "http://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_m.gif",
        "_secureLegacyUrl": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA.gif",
        "_secureLegacyUrlLarge": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_b.gif",
        "_secureLegacyUrlSquare": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_s.gif",
        "_secureLegacyUrlSmall": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_m.gif"
      }, {
        "id": 9228314,
        "personId": 513525,
        "timestamp": 1406222781,
        "lastModified": 1406222781,
        "code": "bPHbzaV7KsGWfuXJ",
        "license": 1,
        "width": 298,
        "height": 232,
        "flickrId": null,
        "flickrOwner": null,
        "_legacyUrl": "http://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA.jpg",
        "_legacyUrlSquare": "http://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_s.jpg",
        "_legacyUrlSmall": "http://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_m.jpg",
        "_secureLegacyUrl": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA.jpg",
        "_secureLegacyUrlLarge": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_b.jpg",
        "_secureLegacyUrlSquare": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_s.jpg",
        "_secureLegacyUrlSmall": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_m.jpg"
      }]
    },
    "paging": {
      "total": 50,
      "page": 1,
      "perPage": 2,
      "token": "UuKKKAkmxv.r4YtwFDuRevZVGAHr"
    }
  }]
}
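Since the same URL can return either this JSON or a captcha page, it may help to check which one came back before using the body. The following is a rough sketch using only the standard library; extract_image_urls is a hypothetical helper, the key names are taken from the response shown above, and the captcha_like HTML is a made-up stand-in rather than the page the site actually serves.

```python
import json


def extract_image_urls(body):
    """Return the HTTPS image URLs from a search response body.

    Returns None when the body is not the expected JSON document,
    for example when an HTML captcha page was served instead.
    """
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        # Not JSON at all (HTML, empty body, undecodable bytes, ...).
        return None
    if not isinstance(payload, dict) or "responses" not in payload:
        return None
    urls = []
    for item in payload["responses"]:
        for image in item.get("models", {}).get("image", []):
            url = image.get("_secureLegacyUrl")
            if url:
                urls.append(url)
    return urls


# Trimmed copy of the response shown above.
sample = (
    b'{"responses": [{"models": {"image": ['
    b'{"id": 18957872, "_secureLegacyUrl": '
    b'"https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA.gif"}]}}]}'
)
# Hypothetical stand-in for a captcha page; the real markup is unknown.
captcha_like = b"<!DOCTYPE html><html><body>Please verify you are human</body></html>"

print(extract_image_urls(sample))
print(extract_image_urls(captcha_like))
```

With the httpx code above, this could be called as extract_image_urls(response.content); a None result would then suggest the request was still being challenged.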