Scrape a website using font or colors in Scrapy
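I need to scrape prices from a website, and I ran into a problem: some prices are crossed out and the new price is shown in red/bold letters. The HTML for those cells is different, so my price comes back as null. I added an if statement to pick up the right data, but the crossed-out price uses the same identifier, so I get that price instead of the red one. Is there a way in Scrapy to select the price I need based on the color being red or the font being bold? If not, is there another way to get the right price?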
for game in response.css("tr[class^=deckdbbody]"):
    # Initialize saved_name to the extracted card name
    saved_name = game.css("a.card_popup::text").extract_first() or saved_name
    # Now call item and set equal to saved_name and strip leading '\n' from output
    item["Card_Name"] = saved_name.strip()
    # Check to see if output is null, in the case that there are two different conditions for one card
    if item["Card_Name"] is not None:
        # If not null then store value in saved_name
        saved_name = item["Card_Name"].strip()
    # If null then set null value to previous card name since if there is a null value you should have the same card name twice
    else:
        item["Card_Name"] = saved_name
    # Call item again in order to extract the condition, stock, and price using the corresponding html code from the website
    item["Condition"] = game.css("td[class^=deckdbbody].search_results_7 a::text").get()
    item["Stock"] = game.css("td[class^=deckdbbody].search_results_8::text").extract_first()
    item["Price"] = game.css("td[class^=deckdbbody].search_results_9::text").extract_first()
    if item["Price"] is None:
        item["Price"] = game.css("td[class^=deckdbbody].search_results_9 span::text").get()
    # Return values
    yield item
You can filter on the style attribute:
response.css('span[style^="color:red;"]::text').get()
You need to adjust the selector expression:
if item["Price"] is None:
    item["Price"] = game.css("td[class^=deckdbbody].search_results_9 span[style*='color:red']::text").get()