Scrapy - Web scraping returning empty list from crypto exchange site

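I'm usually successful at web scraping, but I've run into trouble with this one. I assume I'm being blocked, or the site has security measures in place.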

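I'm trying to get the APY rates from https://www.okcoin.com/earn. You don't need an account to view the page, and it should be fairly simple, so here is my code (the xpath I'm providing is for the table under "Other offers"):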

    from requests_html import HTMLSession
    from scrapy import Selector

    def url_headers(url):
        """
        Creates headers for each url to be scraped.

        param url: webpage for scraping
        :return:
        """

        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:68.0) Gecko/20100101 Firefox/68.0',
            'Accept': '*/*',
            'Accept-Language': 'en-US,en;q=0.5',
            'Accept-Encoding': 'gzip, deflate',
            'Referer': url
        }

        # Get HTML version of document
        sess = HTMLSession()
        res = sess.get(url, headers=headers)

        # Convert to text and extract whole document
        selector = Selector(text=res.content)
        return selector

    sel = url_headers("https://www.okcoin.com/earn")

    apy = sel.xpath(
        '/html/body/div[2]/div/div/div/div[2]/div/div/div[3]/div/div[3]'
    ).extract()

    print(apy)

The data is generated from an API call that returns a JSON response, so you can get it with requests alone.

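    import requests

    # JSON endpoint behind the earn page; the `t` query parameter
    # appears to be a millisecond timestamp
    api_url = 'https://www.okcoin.com/v2/asset/outer/earn/project-currency?t=1646119369275'

    resp = requests.get(api_url).json()

    # Each entry in 'data' carries its rate as a percentage string
    for item in resp['data']:
        print(item['rate'])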

Output:

265.00%
145.00%
21.41%
21.33%
12.55%
10.00%
9.07%
7.94%
6.22%
5.40%
2.31%
3.19%
1.44%
1.83%
1.36%
1.36%
1.29%
1.05%
0.00%
0.00%
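
The rates come back as strings such as `265.00%`, and as the output shows they are not strictly sorted. If you need numbers, a minimal sketch along these lines could convert and order them. It assumes the response keeps the shape used above (a `data` list whose items have a `rate` key); `parse_rates` and the `sample` payload are hypothetical names for illustration, and the sample is hand-written rather than a live response.

```python
def parse_rates(payload):
    """Return the 'rate' strings in payload['data'] as floats, highest first."""
    rates = []
    for item in payload.get('data', []):
        raw = item.get('rate')
        if not raw:
            continue  # skip entries that have no rate
        # '265.00%' -> 265.0
        rates.append(float(raw.strip().rstrip('%')))
    return sorted(rates, reverse=True)


# Hypothetical sample mirroring the assumed response shape (not real data)
sample = {
    'data': [
        {'rate': '2.31%'},
        {'rate': '265.00%'},
        {'rate': '3.19%'},
        {'rate': '0.00%'},
    ]
}

print(parse_rates(sample))
```

With the live endpoint you would pass `resp` from the snippet above instead of `sample`, provided the API still returns the same structure.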