Get Amazon Items Rank using Selenium

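I want to get the rank of every Amazon item on a page using the following Chrome extension: DS Amazon Quick View

To do this, I use Selenium to start Chrome with my own profile (where the extension is installed) and try to scrape the rank HTML. However, `find_all` returns an empty result: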

from selenium import webdriver
from bs4 import BeautifulSoup    

options = webdriver.ChromeOptions()
options.add_argument(r"--user-data-dir=C:\Users\Edo\AppData\Local\Google\Chrome\User Data")  # use your own Chrome user data directory
options.add_argument(r'--profile-directory=Default') 
driver = webdriver.Chrome(executable_path=r"C:\Program Files (x86)\chromedriver.exe", options=options) #get your own exe directory
driver.get("https://www.amazon.com/Best-Sellers-Kindle-Store/zgbs/digital-text/ref=zg_bs_unv_digital-text_1_154606011_1")
soup = BeautifulSoup(driver.page_source.encode('utf-8').strip(), 'html.parser')
print(len(soup.find_all("div", {"class":"xtaqv-result"})))

>>> 0

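There are 4 things to handle on this page:

  1. Items load as you scroll down (initially only 30 items are shown)
  2. The item ranks also load on scroll
  3. Pagination is needed if we want items from the other pages
  4. Correct locators (XPath, CSS, etc.)

So if our code does not wait for the page and the rankings to load completely, we will not get the values.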
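As an alternative to fixed sleeps, the waiting step could be sketched as polling until the number of rank elements stops changing. This is only a minimal sketch: `wait_until_stable` is a hypothetical helper written for illustration, not part of Selenium, and the default timings are assumptions. It only waits, so scrolling is still needed to trigger the lazy loading.

```python
import time
from typing import Callable


def wait_until_stable(count_fn: Callable[[], int], settle_checks: int = 3,
                      poll: float = 0.5, timeout: float = 30.0) -> int:
    """Poll count_fn until it returns the same non-zero value
    settle_checks times in a row, or until timeout seconds have passed.
    Returns the last observed count."""
    deadline = time.monotonic() + timeout
    last, same = count_fn(), 1
    while time.monotonic() < deadline:
        if last > 0 and same >= settle_checks:
            break  # the count has settled
        time.sleep(poll)
        current = count_fn()
        same = same + 1 if current == last else 1
        last = current
    return last


# Possible use with Selenium (assumes an existing `driver` and the `By` import):
# wait_until_stable(lambda: len(driver.find_elements(By.CLASS_NAME, "xtaqv-result")))
```

Passing the counting logic in as a function keeps the helper independent of the browser, so it can be tried out with a fake counter before pointing it at a real page.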

The code below returns the names and rank details of all items on all available pages (only 2 in this case):

Imports:

from time import sleep
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By



options = webdriver.ChromeOptions()
options.add_argument(r"--user-data-dir=C:\Users\username\AppData\Local\Google\Chrome\User Data")
options.add_argument(r'--profile-directory=Default')
PATH = r"path to your\chromedriver.exe"

# Note: executable_path is deprecated in Selenium 4 in favour of a Service object
driver = webdriver.Chrome(executable_path=PATH, options=options)
driver.get(
    "https://www.amazon.com/Best-Sellers-Kindle-Store/zgbs/digital-text/ref=zg_bs_unv_digital-text_1_154606011_1")
sleep(10)  # give the page and the extension time to load


def pagescroll():
    # Press PAGE_DOWN repeatedly so the lazily loaded items and ranks appear
    for x in range(9):
        driver.find_element(By.CSS_SELECTOR, 'body').send_keys(Keys.PAGE_DOWN)
        sleep(2)


def get_items():
    # Scroll through the page, then print the name and ranks of every item
    pagescroll()
    sleep(5)
    allnames = driver.find_elements(By.XPATH, "//*[@id='gridItemRoot']//a//span//div")
    allRanks = driver.find_elements(By.XPATH, "//*[@class='xtaqv-result']")
    for index in range(len(allnames)):
        print(f"--------------------------Item : {index + 1}----------------------------------------")
        print(allnames[index].text)
        print("--------------------------------------------------------------------------------------")
        print(allRanks[index].text)


get_items()
nextElement = driver.find_element(By.CSS_SELECTOR, "div.a-text-center > ul > li.a-last > a")
counter = 1
try:
    # Keep clicking "Next" while the link is displayed
    while nextElement.is_displayed():
        counter = counter + 1
        print("--------------------------------------------------------------------------------------")
        print(f"{counter} : <- page scrapping started")
        print("--------------------------------------------------------------------------------------")
        nextElement.click()
        sleep(5)
        get_items()
except Exception:
    # Any error raised while paging is treated as "no more pages"
    print("--------------------------------------------------------------------------------------")
    print("There is no more page left with items")
    print("--------------------------------------------------------------------------------------")

driver.quit()


**Output:** (not all items are shown, because the full output exceeds the character limit)
--------------------------Item : 1----------------------------------------
Taste
--------------------------------------------------------------------------------------
#1 in Kindle Store (Top 100)
#1 in Contemporary Romance (Kindle Store)
#1 in Romantic Comedy (Kindle Store)
#1 in Romantic Comedy (Books)
--------------------------Item : 2----------------------------------------
Family Money
--------------------------------------------------------------------------------------
#2 in Kindle Store (Top 100)
#1 in Domestic Thrillers (Kindle Store)
#1 in Psychological Thrillers (Books)
#1 in Literature & Fiction (Kindle Store)
--------------------------Item : 3----------------------------------------
Run, Rose, Run: A Novel
--------------------------------------------------------------------------------------
#3 in Kindle Store (Top 100)
#1 in Southern United States Fiction
#2 in Literature & Fiction (Kindle Store)
#2 in Crime Thrillers (Kindle Store)
--------------------------Item : 4----------------------------------------
Reminders of Him: A Novel
--------------------------------------------------------------------------------------
#4 in Kindle Store (Top 100)
#1 in New Adult & College Romance (Books)
#1 in Mothers & Children Fiction
#1 in Romance (Kindle Store)
--------------------------Item : 5----------------------------------------
The Last Eligible Billionaire
--------------------------------------------------------------------------------------
#5 in Kindle Store (Top 100)
#1 in Billionaire Romance
#2 in Romantic Comedy (Kindle Store)
#2 in Women's Romance Fiction
--------------------------Item : 6----------------------------------------
The Washington Post Digital Access
--------------------------------------------------------------------------------------
#6 in Kindle Store (Top 100)
#1 in eNewspapers
#1 in U.S. Newspapers
--------------------------Item : 7----------------------------------------
Things We Never Got Over
--------------------------------------------------------------------------------------
#7 in Kindle Store (Top 100)
#1 in General Humorous Fiction
#1 in Men, Women & Relationships Humor
#1 in Small Town & Rural Fiction (Kindle Store)
-
-
-
--------------------------Item : 49----------------------------------------
Heir to Love
--------------------------------------------------------------------------------------
#49 in Kindle Store (Top 100)
#8 in New Adult & College Romance (Books)
#23 in Romance (Kindle Store)
--------------------------Item : 50----------------------------------------
America's Last Fortress: Puerto Rico's Sovereignty, China's Caribbean Belt and Road, and America's National Security
--------------------------------------------------------------------------------------
#50 in Kindle Store (Top 100)
#1 in History of the Caribbean & West Indies
#1 in History of Latin America
#1 in International Relations (Kindle Store)
--------------------------------------------------------------------------------------
2 : <- page scrapping started
--------------------------------------------------------------------------------------
--------------------------Item : 1----------------------------------------
Stepbrother Weekend: Filthy Dirty Desires
--------------------------------------------------------------------------------------
#51 in Kindle Store (Top 100)
#1 in Erotic Literature & Fiction
#1 in Erotica (Kindle Store)
--------------------------Item : 2----------------------------------------
Forget-Me-Not Bombshell
--------------------------------------------------------------------------------------
#52 in Kindle Store (Top 100)
#1 in Women's Action & Adventure Fiction
#1 in Organized Crime (Kindle Store)
#2 in Action & Adventure Romance (Kindle Store)
--------------------------Item : 3----------------------------------------
By a Thread: A Grumpy Boss Romantic Comedy
--------------------------------------------------------------------------------------
#53 in Kindle Store (Top 100)
#2 in General Humorous Fiction
#3 in Romance Literary Fiction
#7 in Romantic Comedy (Kindle Store)
--------------------------Item : 4----------------------------------------
How To Start A Conversation And Make Friends: Revised And Updated
--------------------------------------------------------------------------------------
#54 in Kindle Store (Top 100)
#1 in Motivational Self-Help (Kindle Store)
#1 in Running Meetings & Presentations (Kindle Store)
#1 in Healthy Relationships (Kindle Store)
--------------------------Item : 5----------------------------------------
What Lies Beyond the Veil (Of Flesh & Bone Series Book 1)
--------------------------------------------------------------------------------------
#55 in Kindle Store (Top 100)
#1 in Romantic Fantasy (Books)
#1 in Sword & Sorcery Fantasy (Books)
#1 in Greco-Roman Myth & Legend Fantasy eBooks
--------------------------Item : 6----------------------------------------
Verity
--------------------------------------------------------------------------------------
#56 in Kindle Store (Top 100)
#7 in Psychological Thrillers (Kindle Store)
#9 in Psychological Thrillers (Books)
#14 in Romance (Kindle Store)
--------------------------Item : 7----------------------------------------
Mr. Bloomsbury: A feel-good British Billionaire Romance
--------------------------------------------------------------------------------------
#57 in Kindle Store (Top 100)
#1 in Romance Anthologies (Books)
#1 in Romance Collections & Anthologies
#1 in Romance Anthologies (Kindle Store)
--------------------------Item : 8----------------------------------------
Put Me in Detention
--------------------------------------------------------------------------------------
#58 in Kindle Store (Top 100)
#9 in Romantic Comedy (Kindle Store)
#10 in Romantic Comedy (Books)
#15 in Romance (Kindle Store)
--------------------------Item : 9----------------------------------------
A Place Called Freedom
--------------------------------------------------------------------------------------
#59 in Kindle Store (Top 100)
#1 in Espionage Thrillers (Kindle Store)
#1 in Mystery Action Fiction (Kindle Store)
#1 in Historical Scottish Fiction
--------------------------Item : 10----------------------------------------
The Second Home: A Novel
--------------------------------------------------------------------------------------
#60 in Kindle Store (Top 100)
#2 in Sibling Fiction
#4 in Sisters Fiction
#4 in Coming of Age Fiction (Books)
--------------------------Item : 11----------------------------------------
The Last Green Valley: A Novel
--------------------------------------------------------------------------------------
#61 in Kindle Store (Top 100)
#1 in Historical Biographical Fiction
#1 in Biographical Fiction (Books)
#1 in Biographical Literary Fiction
--------------------------Item : 12----------------------------------------
Hidden: An Exciting Novel of Suspense (A Lost and Found Novel Book 1)
--------------------------------------------------------------------------------------
#62 in Kindle Store (Top 100)
#1 in Contemporary
#1 in Thrillers (Kindle Store)
#1 in Heist Thrillers
--------------------------Item : 49----------------------------------------
Sweet (Landry Family Series Book 6)
--------------------------------------------------------------------------------------
#99 in Kindle Store (Top 100)
#4 in Inspirational Romance
#5 in New Adult & College Romance (Kindle Store)
#6 in Women's New Adult & College Fiction
--------------------------Item : 50----------------------------------------
Sold on a Monday: A Novel
--------------------------------------------------------------------------------------
#100 in Kindle Store (Top 100)
#3 in Historical Fiction (Kindle Store)
#3 in Literary Fiction (Kindle Store)
#3 in U.S. Historical Fiction
--------------------------------------------------------------------------------------
There is no more page left with items
--------------------------------------------------------------------------------------
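
If the ranks are needed as data rather than printed text, the lines of each item's rank block can be split into numbers and categories. The following is a minimal sketch that assumes every rank line has the form `#<number> in <category>`, as in the output above. `parse_ranks` and `RANK_LINE` are hypothetical names introduced here, and the format comes from the extension, so it may change.

```python
import re
from typing import List, Tuple

# Matches lines such as "#49 in Kindle Store (Top 100)"
RANK_LINE = re.compile(r"^#(\d[\d,]*) in (.+)$")


def parse_ranks(text: str) -> List[Tuple[int, str]]:
    """Turn the multi-line rank text of one item into (rank, category) pairs.
    Lines that do not look like '#<number> in <category>' are skipped."""
    ranks = []
    for line in text.splitlines():
        match = RANK_LINE.match(line.strip())
        if match:
            number = int(match.group(1).replace(",", ""))
            ranks.append((number, match.group(2)))
    return ranks


sample = (
    "#49 in Kindle Store (Top 100)\n"
    "#8 in New Adult & College Romance (Books)\n"
    "#23 in Romance (Kindle Store)"
)
print(parse_ranks(sample))
# [(49, 'Kindle Store (Top 100)'), (8, 'New Adult & College Romance (Books)'), (23, 'Romance (Kindle Store)')]
```

In the scraping code, the text of one `xtaqv-result` element would be passed to `parse_ranks` in place of `sample`.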