Scrapy - Web scraping returning empty list from crypto exchange site
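I'm usually successful with web scraping, but I'm having trouble with this one. I assume I'm being blocked, or they have security measures in place.
I'm trying to get the APY rates from https://www.okcoin.com/earn. You don't need an account to view the page, and it should be fairly simple, so here is my code (the xpath I provided is for the table under "Other offers"):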
from requests_html import HTMLSession
from scrapy import Selector


def url_headers(url):
    """
    Fetches a page with browser-like headers and wraps it in a Selector.
    :param url: webpage for scraping
    :return: scrapy Selector built from the response body
    """
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:68.0) Gecko/20100101 Firefox/68.0',
        'Accept': '*/*',
        'Accept-Language': 'en-US,en;q=0.5',
        'Accept-Encoding': 'gzip, deflate',
        'Referer': url
    }
    # Get HTML version of document
    sess = HTMLSession()
    res = sess.get(url, headers=headers)
    # Convert to text and extract whole document
    selector = Selector(text=res.text)
    return selector


sel = url_headers("https://www.okcoin.com/earn")
apy = sel.xpath(
    '/html/body/div[2]/div/div/div/div[2]/div/div/div[3]/div/div[3]'
).extract()
print(apy)
The data is generated from an API call that returns a JSON response, so you can get it with requests alone:
import requests

api_url = 'https://www.okcoin.com/v2/asset/outer/earn/project-currency?t=1646119369275'
resp = requests.get(api_url).json()
for item in resp['data']:
    print(item['rate'])
Output:
265.00%
145.00%
21.41%
21.33%
12.55%
10.00%
9.07%
7.94%
6.22%
5.40%
2.31%
3.19%
1.44%
1.83%
1.36%
1.36%
1.29%
1.05%
0.00%
0.00%
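If the rates are needed as numbers rather than strings, the same request could be wrapped roughly as follows. This is only a sketch under a few assumptions: that the `t` query parameter is a millisecond timestamp used for cache busting (the answer does not say what it is), and that each item in `data` carries a `rate` string such as `21.41%`, as in the output above. The names `parse_rates` and `fetch_rates` are made up for this example.

```python
import time

# Endpoint taken from the answer above, without the hard-coded 't' value.
API_URL = "https://www.okcoin.com/v2/asset/outer/earn/project-currency"


def parse_rates(payload):
    """Convert 'rate' strings like '21.41%' to floats, skipping unparsable entries."""
    rates = []
    for item in payload.get("data") or []:
        raw = str(item.get("rate", "")).strip().rstrip("%")
        try:
            rates.append(float(raw))
        except ValueError:
            # Missing or non-numeric rate: ignore this entry
            continue
    return rates


def fetch_rates(timeout=10):
    """Call the JSON endpoint and return the rates as floats."""
    # Imported here so parse_rates can be used without requests installed
    import requests

    # Assumption: 't' is the current time in milliseconds
    params = {"t": int(time.time() * 1000)}
    resp = requests.get(API_URL, params=params, timeout=timeout)
    resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return parse_rates(resp.json())
```

Calling `fetch_rates()` should then return a list of floats; the timeout and `raise_for_status()` are there so a blocked or failing request shows up as an exception rather than as an empty result.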