How to authenticate with the Yelp API in Scrapy? How do I pass the Secret_Token and search params?

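Below is my code, which produces a 400 error in the Scrapy log. The logic behind it is as follows: 1) I use a POST request to get my Secret_Token. 2) I set the header to use the secret token and define the params for the API search string. I also think the header and Secret_Token should be passed as meta to further requests. 3) I expect the parse function to receive the JSON response from the request in #2 and parse it into items. After that, inside the parse method, I would loop over a list of params, reusing the working request from #2.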

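The problem is that it does not work :) Log attached. I would like to know whether I am passing the params and the secret token correctly, and how to pass the secret token in meta.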

import scrapy
import json
import requests
import pprint



class YelpSpider(scrapy.Spider):
    name = "yelp"
    allowed_domains = ["https://api.yelp.com"]

    def start_requests(self):
        params = {
            'grant_type': 'client_credentials',
            'client_id': '*******',
            'client_secret' : '*******'
        }  

        request = requests.post('https://api.yelp.com/oauth2/token', params = params)



        bearer_token = request.json()['access_token']
        headers = {'Authorization' : 'Bearer %s' % bearer_token}

        params = {
                    'term': 'restaurant',
                    'offset': 20,
                    'cc' : 'AU',
                    'location': 4806
                }

        yield scrapy.Request('https://api.yelp.com/v3/businesses/search', headers = headers, cookies = params, callback= self.parse)







    def parse(self, response):
        item = response.json()['businesses']
        return item

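Yes, you can absolutely do this with Scrapy, but you won't be using a Python library as the API client. Instead, you need to make the requests directly, as specified in the Yelp developers documentation.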
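As a minimal sketch of what requesting directly means here: the search params go into the URL query string and the token goes into an `Authorization` header. The question's code sends the params as cookies instead, which is a likely (though unconfirmed) cause of the 400. The helper name `build_search_request` is hypothetical and the token value is a placeholder.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(bearer_token, params):
    """Return (url, headers) for a search call.

    Hypothetical helper: params are encoded into the query string and the
    token is sent in the Authorization header, not as cookies.
    """
    url = SEARCH_URL + "?" + urlencode(params)
    headers = {"Authorization": "Bearer %s" % bearer_token}
    return url, headers


url, headers = build_search_request(
    "PLACEHOLDER_TOKEN",
    {"term": "restaurant", "offset": 20, "cc": "AU", "location": 4806},
)
print(url)
print(headers)
```

The returned pair can be handed to `scrapy.Request(url, headers=headers, ...)` unchanged.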

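Below is fully working code for the Yelp Fusion API with Scrapy. I have not yet implemented the URL generation logic based on postal code and the offset parameter to show up to 1000 entries, nor the item implementation. If you have suggestions on how to improve the code, please post a comment.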

P.S. By the way, the Fusion API has increased the limit on displayed results to 50, so you can now use 'limit': 50, 'offset': 50.
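The URL generation that is left unimplemented here could look roughly like the sketch below. It assumes the 1000-entry ceiling and the `limit` of 50 mentioned above. The function name `paged_search_urls` and its arguments are hypothetical, and the exact boundary behaviour of `offset` should be checked against the Yelp documentation.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def paged_search_urls(location, term="restaurant", limit=50, max_results=1000):
    """Yield one search URL per page for a single location (hypothetical helper)."""
    # Offsets advance in steps of `limit` and stop before `max_results`.
    for offset in range(0, max_results, limit):
        params = {"term": term, "location": location, "limit": limit, "offset": offset}
        yield SEARCH_URL + "?" + urlencode(params)


urls = list(paged_search_urls(4806))
print(len(urls))  # number of pages generated for one postal code
print(urls[0])
print(urls[-1])
```

In a spider, each URL would be yielded as a `scrapy.Request` carrying the `Authorization` header, once per postal code.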

# -*- coding: utf-8 -*-
import json
from urllib.parse import urlencode

import scrapy


class YelpSpider(scrapy.Spider):
    name = "yelp"

    def start_requests(self):
        # As per the Yelp docs, we POST our credentials to get an access_token.
        # The response is handled in a separate callback (parse), as I do not
        # know how to do it all in one function.
        params = {
            'grant_type': 'client_credentials',
            'client_id': '**********',
            'client_secret': '************',
        }

        yield scrapy.Request(
            url='https://api.yelp.com/oauth2/token',
            method="POST",
            body=urlencode(params),
        )

    def parse(self, response):
        # Retrieve the access token from the response and set the header
        # according to the Yelp docs.
        bearer_token = json.loads(response.body)['access_token']
        headers = {'Authorization': 'Bearer %s' % bearer_token}

        # Set the search parameters.
        params = {
            'term': 'restaurant',
            'offset': 20,
            'cc': 'AU',
            'location': 4806,
        }
        # Base search URL for the Fusion API.
        url = "https://api.yelp.com/v3/businesses/search"

        # Form the GET request to receive the final info as JSON. Unfortunately,
        # I did not find a more appropriate way to pass params in Scrapy than
        # appending them to the URL as shown below.
        yield scrapy.Request(
            url=url + '?' + urlencode(params),
            method="GET",
            headers=headers,
            callback=self.parse_items,
        )

    def parse_items(self, response):
        # Parse the needed items.
        resp = json.loads(response.body)['businesses']
        print(resp)