How to write data from the web to a CSV file every 10 min

Question

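Hello, I'm fairly new to Python and to web scraping in general, but I'm trying to get a data value from a website and write it to a CSV file. That part works fine for me. My problem is that I want the script to fetch the value every hour and store it in the CSV file. So I'm doing something wrong with the scheduling command, because fetching the value and writing it to the CSV file works well, but only when I press Run. Here is the code I tried.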

import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime
import os
import schedule
import time


def job():
    url = 'https://coinmarketcap.com/currencies/bitcoin-cash/'
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page, 'html.parser')
    name_box = soup.find('span', attrs={'class': 'text-large2'})
    bch_value = float(name_box.text.strip())

    os.chdir('C:\Users\NIK\.spyder2\PythonScripts')

    with open('BCH_kurs', 'a') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow([bch_value, datetime.now()])


schedule.every(1).minutes.do(job)
schedule.every().hour.do(job)
schedule.every().day.at("10:30").do(job)
schedule.every(5).to(10).minutes.do(job)
schedule.every().monday.do(job)
schedule.every().wednesday.at("13:15").do(job)

while True:
    schedule.run_pending()
    time.sleep(1)

Answer 1

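I'd suggest you explore the Scrapy framework. Here is a simple example.

You can save the output in whatever format you want, and you can also automate running the scrape at fixed intervals.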

Answer 2

schedule is an

in-process scheduler for periodic jobs ( https://pypi.python.org/pypi/schedule )

so it schedules runs within the process. To start that process you have to run the script, and the schedule then runs inside it ...
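
To illustrate what "in-process" means here, the following is a minimal sketch, assuming Python 3 and only the standard library (it does not use the schedule package). The function names `append_reading` and `run_every` are made up for this example. The point it tries to show is that the repeating loop lives inside the running script, so the job is only called again while that script is still running.

```python
import csv
import time
from datetime import datetime


def append_reading(path, value, when=None):
    """Append one (value, timestamp) row to the CSV file at `path`."""
    if when is None:
        when = datetime.now()
    # newline="" keeps the csv module from emitting blank lines on Windows.
    with open(path, "a", newline="") as csv_file:
        csv.writer(csv_file).writerow([value, when.isoformat()])


def run_every(interval_seconds, job, max_runs=None, sleep=time.sleep):
    """Call `job` repeatedly, pausing `interval_seconds` between calls.

    The loop runs inside this process: once the process exits, the job
    is no longer called. `max_runs=None` means loop until interrupted.
    `sleep` is injectable so the loop can be exercised without waiting.
    Returns the number of times `job` was called.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Skip the pause after the final run.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

For the 10-minute interval in the title, this could be called as `run_every(600, lambda: append_reading('BCH_kurs.csv', fetch_value()))`, where `fetch_value` is a placeholder for the scraping code from the question. The script has to be left running for further rows to appear.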

python

schedule