file_get_contents gets a different file from Google than the one shown in the browser

Question

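I use file_get_contents to fetch a search URL and check whether it returns any results.

If I open the same URL in my browser, the page shown is different from what I get when I echo the output of file_get_contents: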

$url = "http://www.google.com/search?q=*a*+site:www.reddit.com/r/+-inurl:(/shirt/|/related/|/domain/|/new/|/top/|/controversial/|/widget/|/buttons/|/about/|/duplicates/|dest=|/i18n)&num=1&sort=date-sdate";
$google_search = file_get_contents($url);

What is wrong with my code?

Answer 1

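I assume Google checks the user agent to block any kind of automated search.

So you should at least use cURL and set a proper user agent string (i.e. the same one a regular browser sends) to "trick" Google.

Somehow I fear that tricking Google won't be that easy, but maybe I'm just being paranoid, and at least you will learn something about cURL.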
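
A minimal sketch of that suggestion, assuming the PHP cURL extension is installed. The helper names build_curl_options and fetch_with_user_agent are hypothetical, and the User-Agent value is only whatever you choose to pass in; nothing here guarantees that Google will serve the same page it serves to a browser.

```php
<?php
// Hypothetical helpers for illustration; not part of any library.

// Collect the cURL options in one place so they are easy to inspect.
function build_curl_options(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent header
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,         // give up after 10 seconds
    ];
}

// Returns the response body as a string, or false on failure.
function fetch_with_user_agent(string $url, string $userAgent)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $userAgent));
    $body = curl_exec($ch);
    if ($body === false) {
        error_log('cURL error: ' . curl_error($ch));
    }
    return $body; // the handle is released when $ch goes out of scope
}
```

Calling fetch_with_user_agent($url, $agent) with the $url from the question and a User-Agent string copied from your own browser would replace the file_get_contents($url) call.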

Answer 2

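Nothing, really. The problem is that the page uses JavaScript and AJAX to load its content. So, to get a "snapshot" of the page, you need to "run it". That is, you need to interpret the JavaScript code, which PHP does not do.

Your best bet is to use a headless browser such as PhantomJS. If you search, you will find tutorials that explain how to do it.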
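
One way this could look from PHP, as a rough sketch: it assumes a phantomjs binary is installed and on the PATH, and the names RENDER_SCRIPT, build_phantomjs_command and render_page are hypothetical. The embedded script prints the DOM as soon as the page has loaded; content fetched later through AJAX may need an extra delay before page.content is read.

```php
<?php
// Sketch only: assumes a `phantomjs` executable is available on the PATH.

// PhantomJS script: open the URL passed as first argument and print the rendered DOM.
const RENDER_SCRIPT = <<<'JS'
var page = require('webpage').create();
var url = require('system').args[1];
page.open(url, function (status) {
    if (status === 'success') {
        console.log(page.content);
    }
    phantom.exit(status === 'success' ? 0 : 1);
});
JS;

function build_phantomjs_command(string $scriptPath, string $url): string
{
    // escapeshellarg() keeps characters such as | ( ) in the URL away from the shell.
    return 'phantomjs ' . escapeshellarg($scriptPath) . ' ' . escapeshellarg($url);
}

// Returns the rendered HTML, or null if the command produced no output.
function render_page(string $url): ?string
{
    $scriptPath = sys_get_temp_dir() . '/render_' . uniqid() . '.js';
    file_put_contents($scriptPath, RENDER_SCRIPT);
    $html = shell_exec(build_phantomjs_command($scriptPath, $url));
    unlink($scriptPath);
    return is_string($html) ? $html : null;
}
```

With this in place, render_page($url) would stand in for file_get_contents($url) in the question's code.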

Note

If you are looking for a way to retrieve the raw data from a search, you may want to try Google's search API.
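
The answer does not say which API it means. As an illustration only, the sketch below assumes the Custom Search JSON API, which takes an API key (key) and a search engine ID (cx) and returns JSON with an items array. The helper names build_search_url and extract_links are hypothetical, and the key and ID must be your own.

```php
<?php
// Sketch under the assumption that the Custom Search JSON API is used.

// Build the request URL; http_build_query() percent-encodes the query for us.
function build_search_url(string $apiKey, string $engineId, string $query, int $num = 1): string
{
    $params = ['key' => $apiKey, 'cx' => $engineId, 'q' => $query, 'num' => $num];
    return 'https://www.googleapis.com/customsearch/v1?'
        . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

// Pull the result links out of a JSON response body; returns [] if there are none.
function extract_links(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['items']) || !is_array($data['items'])) {
        return [];
    }
    return array_column($data['items'], 'link');
}
```

A response fetched with file_get_contents(build_search_url($key, $cx, $query)) can be passed straight to extract_links(); an empty array means the search returned no results, which is the check the question is after.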


php

url

file-get-contents