How to access pages with basic authentication (Apify SDK)
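In the Puppeteer documentation I found that I can use
await page.authenticate({ username: 'test', password: 'test' });
to access a page with basic authentication.
But by the time handlePageFunction runs, the request seems to have already been made.
So what should I do?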
Apify.main(async () => {
    const requestQueue = await Apify.openRequestQueue(`PC_${settings.project}_${time}`);
    await requestQueue.addRequest({ url: settings.baseUrl });
    const crawler = new Apify.PuppeteerCrawler({
        requestQueue,
        launchPuppeteerOptions: {
            headless: settings.headless,
            // slowMo: 500,
        },
        maxRequestsPerCrawl: settings.maxurls,
        maxConcurrency: settings.maxcrawlers,
        handlePageFunction: async ({ request, response, page }) => {
            await page.authenticate({ username: 'test', password: 'test' });
            await page.waitFor(settings.waitForPageload);
            const requestUrl = request.url;
            const loadUrl = request.loadedUrl;
            let isRedirected = false;
            if (requestUrl !== loadUrl) {
                isRedirected = { from: requestUrl, to: loadUrl };
            }
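You can use gotoFunction to work with the page before it is opened. Calling page.authenticate there sets the credentials before navigation, which handlePageFunction cannot do because it runs after the page has already loaded.
If you need to log in to a website, you can check this short login example.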
const crawler = new Apify.PuppeteerCrawler({
    gotoFunction: async ({ page, request }) => {
        await page.authenticate({ username: 'test', password: 'test' });
        return page.goto(request.url, { timeout: 120000 });
    },
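To make the ordering explicit, here is a minimal sketch of the same idea written as a reusable factory. The names createAuthenticatedGoto, createFakePage and demo are hypothetical helpers introduced only for illustration; they are not part of the Apify SDK or Puppeteer. The fake page stands in for a real Puppeteer page so the snippet runs without launching a browser. The only real page calls assumed are page.authenticate and page.goto, as used in the answer above.

```javascript
// Hypothetical helper (not part of the Apify SDK): builds a gotoFunction that
// sets HTTP basic-auth credentials on the page before navigating to the URL.
function createAuthenticatedGoto({ username, password }, gotoOptions = { timeout: 120000 }) {
    return async ({ page, request }) => {
        // Must run before page.goto(), so the credentials are set for the first request.
        await page.authenticate({ username, password });
        return page.goto(request.url, gotoOptions);
    };
}

// Minimal stand-in for a Puppeteer page, used only to record the call order.
function createFakePage() {
    const calls = [];
    return {
        calls,
        authenticate: async (credentials) => {
            calls.push(['authenticate', credentials]);
        },
        goto: async (url, options) => {
            calls.push(['goto', url, options]);
            return { status: () => 200 };
        },
    };
}

// Runs the gotoFunction against the fake page and returns the names of the calls made.
async function demo() {
    const gotoFunction = createAuthenticatedGoto({ username: 'test', password: 'test' });
    const page = createFakePage();
    await gotoFunction({ page, request: { url: 'https://example.com/protected' } });
    return page.calls.map(([name]) => name);
}

demo().then((order) => console.log(order.join(' -> ')));
// prints "authenticate -> goto"
```

In a real crawler you would likely pass the result directly as the gotoFunction option, for example gotoFunction: createAuthenticatedGoto({ username: 'test', password: 'test' }), and leave handlePageFunction to work with the page once it has loaded.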