python beautifulsoup4 selenium ChromeDriverManager [Web page scraping does not reach the end of the list]

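I'm trying to get a list of coin names from a web page. I tried BeautifulSoup, but for some reason it didn't work. I also tried Selenium :( but that didn't work either.

What is wrong with that website? (I suspect it has something to do with JavaScript and the DOM, but I don't really understand it.) Could I get some help fetching the whole list from the page? (I use ChromeDriverManager to avoid some errors.)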

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup
from selenium.webdriver.common.keys import Keys
import time

options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-logging"])
driver = webdriver.Chrome(ChromeDriverManager().install(),options=options)

html = driver.get('https://coinmarketcap.com/')
html = driver.page_source

driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
driver.maximize_window()
driver.implicitly_wait(10)

soup = BeautifulSoup(html, 'html.parser')

status_today = soup.find_all('div',{'class':'sc-16r8icm-0 escjiH'},'href')

for x in status_today:
    print('x.a[href]=',x.a['href'])

The result has only 10 rows, but the page lists 100 coins...

x.a[href]= /currencies/bitcoin/
x.a[href]= /currencies/ethereum/
x.a[href]= /currencies/binance-coin/
x.a[href]= /currencies/cardano/
x.a[href]= /currencies/tether/
x.a[href]= /currencies/xrp/
x.a[href]= /currencies/solana/
x.a[href]= /currencies/polkadot-new/
x.a[href]= /currencies/usd-coin/
x.a[href]= /currencies/dogecoin/

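You need to scroll to each element; after that you can extract the href from its anchor tag. The rows further down appear to be filled in only once they scroll into view, which would explain why you get just the first 10 links.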

Also make sure to use explicit waits.
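To make the idea of an explicit wait concrete, here is a minimal sketch of the polling loop behind it. This is not Selenium's implementation: `wait_until`, its parameters, and the fake condition `ready_on_third_call` are hypothetical names used only for illustration, and the sketch uses nothing but the standard library.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within timeout")
        time.sleep(poll)


# Hypothetical condition: pretends the element shows up on the third check.
calls = {"n": 0}


def ready_on_third_call():
    calls["n"] += 1
    return "row" if calls["n"] >= 3 else None


found = wait_until(ready_on_third_call, timeout=2.0, poll=0.01)
print(found, calls["n"])
```

The point is that an explicit wait returns as soon as the condition holds instead of sleeping for a fixed time, and fails loudly when the condition never holds.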

The XPath we are using is //tbody//tr, combined with an index.
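Why the index goes on a parenthesised expression can be shown with a small sketch. It uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and has no parenthesised form, so the "n-th row overall" case is emulated with a Python list index. The table markup and the helper `nth_row` are made up for illustration and are not taken from the site.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two <tbody> blocks with two rows each.
DOC = """
<table>
  <tbody>
    <tr><td>bitcoin</td></tr>
    <tr><td>ethereum</td></tr>
  </tbody>
  <tbody>
    <tr><td>tether</td></tr>
    <tr><td>xrp</td></tr>
  </tbody>
</table>
""".strip()

root = ET.fromstring(DOC)

# tr[1] means "a tr that is the first tr of its own parent",
# so it matches one row per <tbody>.
first_per_parent = [tr.find("td").text for tr in root.findall(".//tbody/tr[1]")]

# (//tbody//tr)[j] means "the j-th row of the whole result set".
all_rows = root.findall(".//tbody/tr")


def nth_row(j):
    """Return the text of the j-th row overall (1-based, like XPath)."""
    return all_rows[j - 1].find("td").text


print(first_per_parent)
print(nth_row(1), nth_row(3))
```

With a single `tbody` whose rows are all direct children, both forms pick the same row; the parentheses matter once rows can sit under more than one parent.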

Code:

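# driver_path: path to the chromedriver executable
# (Selenium 3 style; Selenium 4 passes it through a Service object)
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
# Element lookups poll for up to 30 s before failing
driver.implicitly_wait(30)
wait = WebDriverWait(driver, 30)

driver.get("https://coinmarketcap.com/")

j = 1  # XPath positions are 1-based
while True:
    try:
        # Wait for the j-th table row, then scroll it into view
        row = wait.until(EC.visibility_of_element_located((By.XPATH, f"(//tbody//tr)[{j}]")))
        driver.execute_script("arguments[0].scrollIntoView(true);", row)
        # Read the link from the anchor inside the div with this class
        href = row.find_element(By.XPATH, ".//div[@class='sc-16r8icm-0 escjiH']//a").get_attribute('href')
        print(href)
        j += 1
    except Exception:
        # No row at this index (the wait timed out) or no link in the row: stop
        break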

Imports:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Output:

https://coinmarketcap.com/currencies/bitcoin/
https://coinmarketcap.com/currencies/ethereum/
https://coinmarketcap.com/currencies/binance-coin/
https://coinmarketcap.com/currencies/cardano/
https://coinmarketcap.com/currencies/tether/
https://coinmarketcap.com/currencies/xrp/
https://coinmarketcap.com/currencies/solana/
https://coinmarketcap.com/currencies/polkadot-new/
https://coinmarketcap.com/currencies/usd-coin/
https://coinmarketcap.com/currencies/dogecoin/
https://coinmarketcap.com/currencies/terra-luna/
https://coinmarketcap.com/currencies/uniswap/
https://coinmarketcap.com/currencies/binance-usd/
https://coinmarketcap.com/currencies/avalanche/
https://coinmarketcap.com/currencies/litecoin/
https://coinmarketcap.com/currencies/wrapped-bitcoin/
https://coinmarketcap.com/currencies/shiba-inu/
https://coinmarketcap.com/currencies/chainlink/
https://coinmarketcap.com/currencies/bitcoin-cash/
https://coinmarketcap.com/currencies/algorand/
https://coinmarketcap.com/currencies/polygon/
https://coinmarketcap.com/currencies/stellar/
https://coinmarketcap.com/currencies/filecoin/
https://coinmarketcap.com/currencies/cosmos/
https://coinmarketcap.com/currencies/internet-computer/
https://coinmarketcap.com/currencies/axie-infinity/
https://coinmarketcap.com/currencies/vechain/
https://coinmarketcap.com/currencies/ethereum-classic/
https://coinmarketcap.com/currencies/tron/
https://coinmarketcap.com/currencies/multi-collateral-dai/
https://coinmarketcap.com/currencies/ftx-token/
https://coinmarketcap.com/currencies/tezos/
https://coinmarketcap.com/currencies/theta/
https://coinmarketcap.com/currencies/bitcoin-bep2/
https://coinmarketcap.com/currencies/fantom/
https://coinmarketcap.com/currencies/hedera/
https://coinmarketcap.com/currencies/monero/
https://coinmarketcap.com/currencies/pancakeswap/
https://coinmarketcap.com/currencies/crypto-com-coin/
https://coinmarketcap.com/currencies/elrond-egld/
https://coinmarketcap.com/currencies/eos/
https://coinmarketcap.com/currencies/ecash/
https://coinmarketcap.com/currencies/klaytn/
https://coinmarketcap.com/currencies/aave/
https://coinmarketcap.com/currencies/iota/
https://coinmarketcap.com/currencies/near-protocol/
https://coinmarketcap.com/currencies/quant/
https://coinmarketcap.com/currencies/bitcoin-sv/
https://coinmarketcap.com/currencies/the-graph/
https://coinmarketcap.com/currencies/neo/
https://coinmarketcap.com/currencies/waves/
https://coinmarketcap.com/currencies/stacks/
https://coinmarketcap.com/currencies/kusama/
https://coinmarketcap.com/currencies/terrausd/
https://coinmarketcap.com/currencies/harmony/
https://coinmarketcap.com/currencies/unus-sed-leo/
https://coinmarketcap.com/currencies/bittorrent/
https://coinmarketcap.com/currencies/maker/
https://coinmarketcap.com/currencies/omg/
https://coinmarketcap.com/currencies/amp/
https://coinmarketcap.com/currencies/helium/
https://coinmarketcap.com/currencies/celo/
https://coinmarketcap.com/currencies/dash/
https://coinmarketcap.com/currencies/chiliz/
https://coinmarketcap.com/currencies/arweave/
https://coinmarketcap.com/currencies/compound/
https://coinmarketcap.com/currencies/decred/
https://coinmarketcap.com/currencies/thorchain/
https://coinmarketcap.com/currencies/revain/
https://coinmarketcap.com/currencies/holo/
https://coinmarketcap.com/currencies/nem/
https://coinmarketcap.com/currencies/theta-fuel/
https://coinmarketcap.com/currencies/zcash/
https://coinmarketcap.com/currencies/xinfin/
https://coinmarketcap.com/currencies/icon/
https://coinmarketcap.com/currencies/decentraland/
https://coinmarketcap.com/currencies/celsius/
https://coinmarketcap.com/currencies/qtum/
https://coinmarketcap.com/currencies/trueusd/
https://coinmarketcap.com/currencies/enjin-coin/
https://coinmarketcap.com/currencies/sushiswap/
https://coinmarketcap.com/currencies/yearn-finance/
https://coinmarketcap.com/currencies/dydx/
https://coinmarketcap.com/currencies/bitcoin-gold/
https://coinmarketcap.com/currencies/huobi-token/
https://coinmarketcap.com/currencies/curve-dao-token/
https://coinmarketcap.com/currencies/flow/
https://coinmarketcap.com/currencies/mina/
https://coinmarketcap.com/currencies/mdex/
https://coinmarketcap.com/currencies/zilliqa/
https://coinmarketcap.com/currencies/synthetix-network-token/
https://coinmarketcap.com/currencies/ravencoin/
https://coinmarketcap.com/currencies/perpetual-protocol/
https://coinmarketcap.com/currencies/basic-attention-token/
https://coinmarketcap.com/currencies/ren/
https://coinmarketcap.com/currencies/serum/
https://coinmarketcap.com/currencies/renbtc/
https://coinmarketcap.com/currencies/okb/
https://coinmarketcap.com/currencies/iostoken/
https://coinmarketcap.com/currencies/telcoin/
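
Since the question asks for coin names rather than links, one possible follow-up is to reduce each href to its last path segment. This is only a sketch: `coin_slug` is a hypothetical helper, and the slug in the URL is not always the coin's display name (for example `polkadot-new`).

```python
from urllib.parse import urlparse


def coin_slug(href):
    """Return the last non-empty path segment of a coin link."""
    path = urlparse(href).path
    segments = [s for s in path.split("/") if s]
    return segments[-1] if segments else ""


# Works for absolute links (Selenium output) and relative ones (page source).
hrefs = [
    "https://coinmarketcap.com/currencies/bitcoin/",
    "/currencies/polkadot-new/",
]
slugs = [coin_slug(h) for h in hrefs]
print(slugs)  # ['bitcoin', 'polkadot-new']
```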