How to monitor the availability of Decathlon products with Python?

I have a request for you.

I would like to scrape the following product: https://www.decathlon.it/p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804?mc=4687932&c=NERO#

The product has two possible states:

  1. "Attualmente Indisponibile"
  2. "Disponibile"

In a nutshell, I would like to create a script that checks every minute whether the product is available and logs all the data in the shell.

The output could look like this:

28/03/2021 12:07 - Attualmente Indisponibile
28/03/2021 12:08 - Attualmente Indisponibile
28/03/2021 12:09 - Disponibile 

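Is this possible with Python? Could somebody help me write the code? I don't know how to use the requests package or other Python web scraping tools, but I would like to learn.

I tried the following code: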

import requests
import re

urls = ['p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804.html']


def main(site):
    with requests.Session() as req:
        for url in urls:
            r = req.get(site.format(url))
            match = re.search('availability.+org\/(.*?)"', r.text)
            print("url: {:<70}, status: {}".format(r.url, match.group(1)))


main("https://www.decathlon.it/{}")

But it gives me the following error:

AttributeError: 'NoneType' object has no attribute 'group'

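Try this. The change that matters is sending a browser-like User-Agent header with the request. Without it, the response apparently does not contain the availability markup, so re.search returns None and calling .group on it raises the AttributeError.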

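import re
import time

import requests

urls = ['p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804.html']
# Send a browser-like User-Agent with each request
user_agent = {'User-agent': 'Mozilla/5.0'}

def main(site):
    with requests.Session() as req:
        for url in urls:
            r = req.get(site.format(url), headers=user_agent)
            # Capture the token after "org/" in the availability entry, e.g. OutOfStock
            match = re.search(r'availability.+org/(.*?)"', r.text)
            # re.search returns None when the pattern is not found
            status = match.group(1) if match else "unknown"
            print("url: {:<70}, status: {}".format(r.url, status))

# Check once a minute until interrupted with Ctrl+C
while True:
    main("https://www.decathlon.it/{}")
    time.sleep(60)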

Output:

url: https://www.decathlon.it/p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804, status: OutOfStock
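
To get lines in the format asked for in the question (timestamp, then the Italian label), the status token can be mapped and prefixed with the current time. This is only a rough sketch: extract_status, format_log_line and the LABELS mapping are hypothetical names, and the mapping assumes the page reports InStock when the product is available, which the output above does not confirm. The sample markup in the comments is also an assumption about the page structure.

```python
import re
from datetime import datetime

# Assumed mapping from the status token to the labels used in the question.
# Only OutOfStock appears in the output above; InStock is a guess.
LABELS = {
    "InStock": "Disponibile",
    "OutOfStock": "Attualmente Indisponibile",
}


def extract_status(html):
    """Return the token after 'org/' in the availability entry, or None."""
    # Same pattern as in the answer, e.g. "availability":"https://schema.org/OutOfStock"
    match = re.search(r'availability.+org/(.*?)"', html)
    # Guard against a missing match instead of calling .group on None
    return match.group(1) if match else None


def format_log_line(status, now=None):
    """Build a line like '28/03/2021 12:07 - Attualmente Indisponibile'."""
    now = now or datetime.now()
    # Fall back to a neutral label for tokens that are not in the mapping
    label = LABELS.get(status, "unknown status")
    return "{} - {}".format(now.strftime("%d/%m/%Y %H:%M"), label)
```

Inside main, print(format_log_line(extract_status(r.text))) could then replace the existing print call; the once-a-minute loop stays the same.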