How to monitor the availability of Decathlon products with Python?
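I have a request for you.
I want to scrape the following product: https://www.decathlon.it/p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804?mc=4687932&c=NERO#
The product has two possible states:
- "Attualmente Indisponibile" (currently unavailable)
- "Disponibile" (available)
In short, I want to create a script that checks every minute whether the product is available and logs all the data in the shell.
The output could look like this: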
28/03/2021 12:07 - Attualmente Indisponibile
28/03/2021 12:08 - Attualmente Indisponibile
28/03/2021 12:09 - Disponibile
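Is this possible with Python? Can someone help me write the code?
I don't know how to use the requests package or other web-scraping tools in Python, but I want to learn.
I tried the following code: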
import requests
import re

urls = ['p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804.html']

def main(site):
    with requests.Session() as req:
        for url in urls:
            r = req.get(site.format(url))
            match = re.search('availability.+org\/(.*?)"', r.text)
            print("url: {:<70}, status: {}".format(r.url, match.group(1)))

main("https://www.decathlon.it/{}")
But it gives me the following error:
AttributeError: 'NoneType' object has no attribute 'group'
Try this:
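import requests
import re
import time

urls = ['p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804.html']
# Send a browser-like User-Agent header with every request.
user_agent = {'User-agent': 'Mozilla/5.0'}

def main(site):
    with requests.Session() as req:
        for url in urls:
            r = req.get(site.format(url), headers=user_agent)
            # Capture the availability value from the page, e.g. OutOfStock.
            match = re.search(r'availability.+org\/(.*?)"', r.text)
            # re.search returns None when the pattern is not in the page.
            status = match.group(1) if match else "not found"
            print("url: {:<70}, status: {}".format(r.url, status))

# Check once a minute until interrupted (Ctrl+C).
while True:
    main("https://www.decathlon.it/{}")
    time.sleep(60)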
Output:
url: https://www.decathlon.it/p/kit-manubri-e-bilanciere-bodybuilding-93kg/_/R-p-10804, status: OutOfStock
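A note on the error and on the log format asked for in the question. The AttributeError appears because re.search returns None when the pattern is not found in the response text, and calling .group(1) on None fails. That is presumably what happened without the User-Agent header, although the exact page the site returned in that case is not shown here. Below is a minimal sketch, not a definitive implementation, that guards against a missing match and builds lines in the "28/03/2021 12:07 - Attualmente Indisponibile" format. The names extract_availability, format_log_line and LABELS are hypothetical, the InStock value and the mapping to the Italian labels are assumptions (only OutOfStock appears in the output above), and the sample HTML is made up so the example runs offline.

```python
import re
from datetime import datetime

# Same pattern as in the answer above, written as a raw string.
AVAILABILITY_RE = re.compile(r'availability.+org\/(.*?)"')

# Assumed mapping from the scraped value to the labels used in the question.
LABELS = {
    "InStock": "Disponibile",
    "OutOfStock": "Attualmente Indisponibile",
}


def extract_availability(html):
    """Return the availability value found in html, or None if absent."""
    match = AVAILABILITY_RE.search(html)
    return match.group(1) if match else None


def format_log_line(status, now):
    """Build a line such as '28/03/2021 12:07 - Attualmente Indisponibile'."""
    label = LABELS.get(status, "Unknown")  # covers None and unmapped values
    return "{} - {}".format(now.strftime("%d/%m/%Y %H:%M"), label)


# Offline demonstration on a made-up HTML fragment (not real page content).
sample = '<link itemprop="availability" href="https://schema.org/OutOfStock">'
print(format_log_line(extract_availability(sample), datetime(2021, 3, 28, 12, 7)))
print(format_log_line(extract_availability("<html></html>"), datetime(2021, 3, 28, 12, 8)))
```

In the monitoring loop, the same two helpers could be called with r.text and datetime.now() in place of the sample values.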