ajax calls not showing any content in browser

I am trying to scrape artist URLs from this page:

https://myspace.com/discover/artists?genreId=1002532

However, this page makes an ajax call to fetch the user details. I can see this URL in Firebug:
https://myspace.com/ajax/artistspage?chartType=heavyrotation&genreId=1002532&page=0

If I open this URL in a separate tab, nothing is displayed, but if I look at its Response tab in Firebug, all the details are shown.

How can I get all the content?

If you look in Firebug at the request for https://myspace.com/ajax/artistspage?chartType=heavyrotation&genreId=1002532&page=0 when you try to go to it manually in the browser, you will notice that it gets a 401 Unauthorized response. This is because the request headers are set in a special way when the URL is requested from the official myspace page https://myspace.com/discover/artists?genreId=1002532, and with those headers the data request works. When your browser requests the data directly, these headers are not present.

These are the headers that work:

Accept:*/*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Client:persistentId=53065c06-c877-47c5-933a-4b22d7f28cd9&screenWidth=1440&screenHeight=900&timeZoneOffsetHours=7&visitId=31c9d922-9984-4ac5-9bb0-0bb253bc89c3&windowWidth=1043&windowHeight=407
Connection:keep-alive
Cookie:persistent_id=pid%3D53065c06-c877-47c5-933a-4b22d7f28cd9%26llid%3D%26lprid%3D%26lltime%3D; beacons_enabled=true; __utmt=1; ads=adInitVisit%3D1432446031357; player=sequenceId%3D-1%26paused%3Dtrue%26currentTime%3D0%26volume%3D0.5%26mute%3Dfalse%26shuffled%3Dfalse%26repeat%3Doff%26mode%3Dqueue%26radioEntity%3D%26radioMediaType%3D%26radioMediaId%3D%26radioCurrentTime%3D0%26pinned%3Dfalse%26streamStartDateTime%3D%26radioStreamStartDateTime%3D%26at%3D360%26incognito%3Dfalse%26allowSkips%3Dtrue%26ccOn%3Dfalse; visit_id=31c9d922-9984-4ac5-9bb0-0bb253bc89c3; __utma=102911388.1051160901.1432446029.1432446029.1432446029.1; __utmb=102911388.2.10.1432446029; __utmc=102911388; __utmz=102911388.1432446029.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
DNT:1
Hash:NjI2YWM0YzM0YmJiZTg1NsKqwpMGw4HCuAvClMOGwoxAXMOXw50Qw5PCnH7DqVQIAygsY25wwrfCtsOcd8KuwqnCiMKSwobCrMKswpvDhEIrDcKYM0rCocKbJcKYEsKWw53Dr8KIwq7CgMKWw5XCo8KBGHVvURQKwpzDrMO9w5fDlsKzNhDChMOtw7wgw7NuDsK0wq1oC1sOOXAzK8KuwqdyEUDDnRk+w6BPwrIhfsKtw7Fewrcpa8Okw4c%3D
Host:myspace.com
Pragma:no-cache
Referer:https://myspace.com/discover/artists?genreId=1002532
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2409.0 Safari/537.36
X-Requested-With:XMLHttpRequest

And these are the ones that don't:

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Connection:keep-alive
Cookie:persistent_id=pid%3D53065c06-c877-47c5-933a-4b22d7f28cd9%26llid%3D%26lprid%3D%26lltime%3D; beacons_enabled=true; __utmt=1; ads=adInitVisit%3D1432446031357; __utma=102911388.1051160901.1432446029.1432446029.1432446029.1; __utmb=102911388.2.10.1432446029; __utmc=102911388; __utmz=102911388.1432446029.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); player=sequenceId=-1&paused=true&currentTime=0&volume=0.5&mute=false&shuffled=false&repeat=off&mode=queue&radioCurrentTime=0&pinned=false&at=360&incognito=false&allowSkips=true&ccOn=false; visit_id=31c9d922-9984-4ac5-9bb0-0bb253bc89c3
DNT:1
Host:myspace.com
Pragma:no-cache
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2409.0 Safari/537.36
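
One quick way to compare the two header sets is to diff the header names. The following is only a sketch: `header_names`, `WORKING` and `FAILING` are names made up for this illustration, and the long values (Cookie, Client, Hash) are elided with `...` because only the names matter here.

```python
def header_names(raw):
    """Return the set of header names in a block of 'Name:value' lines."""
    names = set()
    for line in raw.splitlines():
        # Split on the first colon only; values may contain colons themselves.
        name, sep, _value = line.partition(":")
        if sep and name.strip():
            names.add(name.strip())
    return names


# Header names copied from the two requests above; long values are elided.
WORKING = """\
Accept:*/*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Client:...
Connection:keep-alive
Cookie:...
DNT:1
Hash:...
Host:myspace.com
Pragma:no-cache
Referer:https://myspace.com/discover/artists?genreId=1002532
User-Agent:...
X-Requested-With:XMLHttpRequest
"""

FAILING = """\
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Connection:keep-alive
Cookie:...
DNT:1
Host:myspace.com
Pragma:no-cache
User-Agent:...
"""

# Headers sent by the page's ajax call but not by a plain browser tab.
only_in_working = sorted(header_names(WORKING) - header_names(FAILING))
print(only_in_working)
```

This only compares names. Headers present in both requests can still differ in value, as Accept does here.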

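You will notice some differences. Most importantly, the working request includes a Hash header as well as a Referer header. I assume that at least the hash must be present for the server to validate the request. You will have to find out how this hash is generated on the myspace page, and possibly also set the Referer header to fake a request coming from the correct page.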
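
As a rough sketch of what such a request could look like from a script, the following builds it with Python's standard `urllib.request`. It rests on several assumptions: that Hash, Referer and X-Requested-With are enough (the server may also check the Client header or the cookies), and that you already have a valid hash value, for example one copied from a live session in Firebug. `build_request` and `fetch` are hypothetical helper names, not part of any myspace API.

```python
import urllib.request

AJAX_URL = ("https://myspace.com/ajax/artistspage"
            "?chartType=heavyrotation&genreId=1002532&page=0")
REFERER = "https://myspace.com/discover/artists?genreId=1002532"


def build_request(hash_value, url=AJAX_URL, referer=REFERER):
    """Build a request that mimics the headers the page's ajax call sends.

    hash_value is NOT computed here; it must come from a real session,
    since how the page generates it is still unknown.
    """
    headers = {
        "Accept": "*/*",
        "Hash": hash_value,
        "Referer": referer,
        "X-Requested-With": "XMLHttpRequest",
    }
    return urllib.request.Request(url, headers=headers)


def fetch(request):
    """Send the request and return the decoded response body."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

To try it, call `fetch(build_request(hash_copied_from_firebug))`. If the server still answers 401, the hash may be tied to the session cookies or may have expired, in which case the Cookie and Client headers would need to be sent as well.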

If you dig into the JS on the page, you will find this snippet in https://x.myspacecdn.com/new/common/js/global.7A07230F0926F7451E2F85D8F2C647D0.min.js:

a.setRequestHeader("Hash",context.hashMashter)

This is where the Hash header is set, using context.hashMashter. If you go to https://x.myspacecdn.com/new/common/js/authentication.68B094D880713CC3A9EB77F984FC09F4.min.js you can see that it is assigned with the following snippet:

context.hashMashter=a.hashMashter

I don't know yet what a is, but if you want to keep exploring, I think this is a good place to start.