Creating multiple dataframes using a loop or a function

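I am trying to extract the hash rates of 3 cryptocurrencies, and I have attached my code below. I want to pass in three URLs and get back three different dictionaries populated with the values. I'm stuck and don't understand how to do this. I tried using a loop, but it isn't working for me.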

url = {'Bitcoin' : 'https://bitinfocharts.com/comparison/bitcoin-hashrate.html#3y', 
       'Ethereum': 'https://bitinfocharts.com/comparison/ethereum-hashrate.html#3y',
       'Litecoin': 'https://bitinfocharts.com/comparison/litecoin-hashrate.html'}

for ele in url:
    
    #### requesting the page and extracting the script which has date and values
    session = requests.Session()
    page = session.get(ele[i])
    soup = BeautifulSoup(page.content, 'html.parser')
    values = str(soup.find_all('script')[4])
    values = values.split('d = new Dygraph(document.getElementById("container"),')[1]

    #create an empty dict to append date and hashrates
    dict([("crypto_1 %s" % i,[]) for i in range(len(url))])

    #run a loop over all the dates and adding to dictionary
    for i in range(values.count('new Date')):
        date = values.split('new Date("')[i+1].split('"')[0]
        value = values.split('"),')[i+1].split(']')[0]
        dict([("crypto_1 %s" % i)[date] = value

You can use the following example to fetch the data from all three URLs and build a dataframe/dictionary from it:

import re
import requests
import pandas as pd


url = {
    "Bitcoin": "https://bitinfocharts.com/comparison/bitcoin-hashrate.html#3y",
    "Ethereum": "https://bitinfocharts.com/comparison/ethereum-hashrate.html#3y",
    "Litecoin": "https://bitinfocharts.com/comparison/litecoin-hashrate.html",
}


data = []
for name, u in url.items():
    html_doc = requests.get(u).text
    for date, hash_rate in re.findall(
        r'\[new Date\("(.*?)"\),(.*?)\]', html_doc
    ):
        data.append(
            {
                "Name": name,
                "Date": date,
                "Hash Rate": float("nan")
                if hash_rate == "null"
                else float(hash_rate),
            }
        )

df = pd.DataFrame(data)
df["Date"] = pd.to_datetime(df["Date"])
# save df to CSV here, e.g. df.to_csv("hash_rates.csv", index=False)
# (the file name is only an example)

# this creates a dictionary whose keys are the crypto names and whose values
# are dicts with the keys "Date" and "Hash Rate":
out = {}
for name, g in df.groupby("Name"):
    out[name] = g[["Date", "Hash Rate"]].to_dict(orient="list")

print(out)

Prints:

{
    "Bitcoin": {
        "Date": [
            Timestamp("2009-01-03 00:00:00"),
            Timestamp("2009-01-04 00:00:00"),
            Timestamp("2009-01-05 00:00:00"),

...
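
Since the question asks for a function that returns one dictionary per coin, here is a minimal sketch of that variant. It reuses the regular expression from the answer above, but the function names (`parse_hash_rates`, `hash_rates_by_name`) and the sample strings are made up for illustration, and the sample values are not real data. The parsing is kept separate from the download so it can be tried without network access.

```python
import math
import re

# Same pattern as in the answer above: captures the date string and the value
# from entries shaped like [new Date("..."),<value>].
ROW_PATTERN = re.compile(r'\[new Date\("(.*?)"\),(.*?)\]')


def parse_hash_rates(html_doc):
    """Map each date string found in html_doc to its hash rate (NaN for null)."""
    return {
        date: math.nan if value == "null" else float(value)
        for date, value in ROW_PATTERN.findall(html_doc)
    }


def hash_rates_by_name(pages):
    """pages maps a coin name to its page text; returns one dict per coin."""
    return {name: parse_hash_rates(html) for name, html in pages.items()}


# Hypothetical sample text in the same shape as the embedded chart data.
sample_pages = {
    "CoinA": '[[new Date("2021/01/01"),1.5],[new Date("2021/01/02"),null]]',
    "CoinB": '[[new Date("2021/01/01"),42]]',
}

result = hash_rates_by_name(sample_pages)
print(result["CoinB"])
```

With the real pages, `pages` could be built as `{name: requests.get(u).text for name, u in url.items()}`, assuming the sites still embed the data in this form. Each inner dictionary can then be handled on its own, which is roughly what the loop in the question was trying to do.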