Can I retrieve sitelinks through the Custom Search API?
I want to fetch the sitelinks that appear in Google search results (for example About Us, Home, etc.). Is there any way to retrieve them?
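I recently implemented the Google Search JSON API, and as far as I understand, the only way to get a site's links is through the formattedUrl or htmlFormattedUrl field that the JSON response includes for each result. The query would be the website in question, and hopefully the first results will give you the relevant links for that site.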
However, if I understand your question correctly, you want to scrape the sub-links of a given website, which is something a web crawler would do. If you are the owner of the website, you can create a sitemap using many tools around the web, but if your intentions can be classified as "other", then I believe you are barking up the wrong tree. See this question, which will point you toward creating a simple web crawler.
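The crawler route mentioned above comes down to one core step: pull every link out of a page and keep the ones that stay on the same site. Below is a minimal sketch of that step, using only the Python standard library. The names `LinkCollector` and `same_site_links` are made up for illustration, and the sketch assumes you already have the page's HTML as a string; fetching pages and following links recursively are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, base_url):
    """Return absolute, de-duplicated links in `html` on the host of `base_url`."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative hrefs against the page they were found on.
        absolute = urljoin(base_url, href)
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```

Calling `same_site_links(page_html, "https://www.example.com/")` on a home page would give a rough list of its internal links, which is close to what a sitemap tool collects, though not necessarily the same set Google shows as sitelinks.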
// Example customsearch#result item for the query Deovandski.
"items": [
  {
    "kind": "customsearch#result",
    "title": "Student Experience - College of Science and Mathematics (NDSU)",
    "htmlTitle": "Student Experience - College of Science and Mathematics (NDSU)",
    "link": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
    "displayLink": "www.ndsu.edu",
    "snippet": "Sep 16, 2015 ... Association for Computing Machinery Student Chapter Chair: Jordan Goetze \nAdvisor: Brian Slator. Upsilon Pi Epsilon President: Deovandski ...",
    "htmlSnippet": "Sep 16, 2015 \u003cb\u003e...\u003c/b\u003e Association for Computing Machinery Student Chapter Chair: Jordan Goetze \u003cbr\u003e\nAdvisor: Brian Slator. Upsilon Pi Epsilon President: \u003cb\u003eDeovandski\u003c/b\u003e ...",
    "cacheId": "pyzF9XJwrXsJ",
    "formattedUrl": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
    "htmlFormattedUrl": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
    "pagemap": {
      "cse_image": [
        {
          "src": "https://www.ndsu.edu/fileadmin/_processed_/csm_080117_anatomy_03med_9dbc3c8cce.jpg"
        }
      ],
      "cse_thumbnail": [
        {
          "width": "184",
          "height": "275",
          "src": "https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcTTL-GZRfSv30cyESsCnd_65BFoLMDdo8fqNS58mHfRbGiOTjSq-e-o28FE"
        }
      ]
    }
  },
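To show how the link fields in an item like the one above might be read, here is a small Python sketch. It assumes the response has already been fetched and parsed into a dict; `result_links` is a hypothetical helper name, and the sample is a trimmed copy of the item above, closed off so that it parses as valid JSON.

```python
import json


def result_links(response):
    """Return (title, link, formattedUrl) for each item in a parsed response."""
    rows = []
    # "items" can be absent when a query has no results, so default to empty.
    for item in response.get("items", []):
        rows.append((item.get("title"), item.get("link"), item.get("formattedUrl")))
    return rows


# Trimmed copy of the sample item, wrapped in an object so it is valid JSON.
sample = json.loads("""
{
  "items": [
    {
      "kind": "customsearch#result",
      "title": "Student Experience - College of Science and Mathematics (NDSU)",
      "link": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
      "displayLink": "www.ndsu.edu",
      "formattedUrl": "https://www.ndsu.edu/scimath/currentstudents/student_experience/"
    }
  ]
}
""")

for title, link, formatted_url in result_links(sample):
    print(title, link, formatted_url)
```

Each tuple is one ordinary search result. Nothing in the sample item is marked as a sitelink, so this only gets you the per-result URLs the answer describes.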