How to scrape time-series chart data from poocoin.app with Python
I'm trying to scrape token info from poocoin. All the other information is available, but I can't scrape the time-series data from the chart.
import requests, re
from bs4 import BeautifulSoup
import pandas as pd
url = 'https://poocoin.app/tokens/0x7606267a4bfff2c5010c92924348c3e4221955f2'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
Update: the new approach is to reverse-engineer the data URL parameter. I'll update the answer when I have a solution.
Previous approach:
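You can make it work by sending a request directly to their API (I believe), converting the response to JSON with the .json() decoder, and then reading the data you need the same way you would access a dictionary: ["some_key"].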
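As a minimal sketch of that dictionary-style access, the snippet below decodes a JSON body by hand with json.loads, which is roughly what .json() does for a JSON response. The raw_body string is a hypothetical, shortened response built from one candle in the sample output of this answer, not a live API call.

```python
import json

# Hypothetical, shortened response body: one candle with values taken
# from the sample output in this answer (not fetched from the API).
raw_body = (
    '[{"count": 1, "time": "2021-11-25T16:15:00.000Z", '
    '"open": 1.7132448e-13, "close": 1.7132448e-13}]'
)

# Decode the JSON text into Python objects (a list of dicts here).
candles = json.loads(raw_body)

# Each candle is a plain dict, so fields are read by key.
first = candles[0]
print(first["time"])
print(first["close"])
```

If a key might be missing, first.get("some_key") returns None instead of raising KeyError.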
To find where the request is sent: Dev tools -> Network -> Fetch/XHR -> find the name and click on it (in this case: candles-bsc?..) -> Preview (check whether the response is what you want) -> Headers -> copy Request URL -> make a request -> optional: add additional request headers if response != 200.
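Once you have copied the Request URL, it can help to split it into a base URL and a params dict so the values are easy to change. Here is a rough sketch using only the standard library. The request_url below is an assumption: I rebuilt it from the parameters used further down in this answer, so the URL you copy from Dev tools may look different.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumed example of a copied Request URL, rebuilt from the parameters
# used later in this answer; the real copied URL may differ.
request_url = (
    "https://api2.poocoin.app/candles-bsc"
    "?to=2021-11-29T09%3A15%3A00.000Z"
    "&limit=321"
    "&lpAddress=0xd8b6A853095c334aD26621A301379Cc3614f9663"
    "&interval=15m"
    "&baseLp=0x58F876857a02D6762E0101bb5C46A8c1ED44Dc16"
)

parts = urlsplit(request_url)

# Everything before the "?" is the endpoint itself.
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"

# The query string becomes a dict of decoded key/value pairs.
params = dict(parse_qsl(parts.query))

print(base_url)
for key, value in params.items():
    print(f"{key} = {value}")
```

The resulting base_url and params can then be passed to requests.get(base_url, params=params, headers=headers).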
You can use Insomnia to test a response. Under Fetch/XHR, find the name -> right click -> copy as cURL (bash) -> place inside Insomnia -> see the response.
In this case, you only need to pass a user-agent in the request headers to receive a 200 status code; otherwise, the request fails with a 403 or 503 status code. Check what your user-agent is.
Pass the user-agent:
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
}
response = requests.get("URL", headers=headers)
import requests
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
}
params = {
    "to": "2021-11-29T09:15:00.000Z",
    "limit": "321",
    "lpAddress": "0xd8b6A853095c334aD26621A301379Cc3614f9663",
    "interval": "15m",
    "baseLp": "0x58F876857a02D6762E0101bb5C46A8c1ED44Dc16"
}
response = requests.get("https://api2.poocoin.app/candles-bsc", params=params, headers=headers).json()
# whole response from API call for a particular token (i believe)
# some data needs to be adjusted (open/close price, etc.)
for result in response:
    # each result is one candle (a dict)
    count = result["count"]
    _time = result["time"]
    open_price = result["open"]
    close_price = result["close"]
    high = result["high"]
    low = result["low"]
    volume = result["volume"]
    base_open = result["baseOpen"]
    base_close = result["baseClose"]
    base_high = result["baseHigh"]
    base_low = result["baseLow"]
    print(f"{count}\n"
          f"{_time}\n"
          f"{open_price}\n"
          f"{close_price}\n"
          f"{high}\n"
          f"{low}\n"
          f"{volume}\n"
          f"{base_open}\n"
          f"{base_close}\n"
          f"{base_high}\n"
          f"{base_low}\n")
# part of the output:
'''
194
2021-11-29T06:00:00.000Z
6.6637177e-13
6.5189422e-13
6.9088173e-13
5.9996067e-13
109146241968737.17
610.0766516756873
611.1764494818917
612.3961994618185
606.7446709385977
1
2021-11-25T16:15:00.000Z
1.7132448e-13
1.7132448e-13
1.7132448e-13
1.7132448e-13
874858231833.1771
643.611707269882
642.5014860521045
644.5105804619558
638.9447353699617
# ...
'''
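Since the question imports pandas and is after a time series, here is a sketch of how the candle list could be turned into a DataFrame indexed by time. To keep it runnable without hitting the API, the candles list is filled in by hand with the two candles from the output above; with a live call you would pass the decoded response instead. Sorting by time is my own assumption about what is wanted, because the two printed candles are not in chronological order.

```python
import pandas as pd

# Two candles copied by hand from the output above (not a live API call).
candles = [
    {"count": 194, "time": "2021-11-29T06:00:00.000Z",
     "open": 6.6637177e-13, "close": 6.5189422e-13,
     "high": 6.9088173e-13, "low": 5.9996067e-13,
     "volume": 109146241968737.17,
     "baseOpen": 610.0766516756873, "baseClose": 611.1764494818917,
     "baseHigh": 612.3961994618185, "baseLow": 606.7446709385977},
    {"count": 1, "time": "2021-11-25T16:15:00.000Z",
     "open": 1.7132448e-13, "close": 1.7132448e-13,
     "high": 1.7132448e-13, "low": 1.7132448e-13,
     "volume": 874858231833.1771,
     "baseOpen": 643.611707269882, "baseClose": 642.5014860521045,
     "baseHigh": 644.5105804619558, "baseLow": 638.9447353699617},
]

# One row per candle, one column per key.
df = pd.DataFrame(candles)

# Parse the ISO 8601 strings into timezone-aware timestamps (UTC).
df["time"] = pd.to_datetime(df["time"], utc=True)

# Use the timestamp as the index and put the rows in chronological order.
df = df.set_index("time").sort_index()

print(df[["open", "close", "high", "low", "volume"]])
```

From here df.to_csv("candles.csv") would save the series; the file name is only an example.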