How to scrape real-time stock price quotes using Selenium in Python

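I need to write a small application that fetches real-time quotes from Money18.on.cc for my school project. My goal is to get all of the data shown on the page. I have tried Selenium with a list of codes ["700", "3690", "1"], but the results look odd: sometimes all 3 results are the same, or the prices are in the wrong order relative to their codes. Here is my code: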

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

webdriverPath = getFileContentItem("all_file_path.txt", "chromeWebdriverDir")
s = Service(webdriverPath)
driver = webdriver.Chrome(service = s)
driver.implicitly_wait(10)
driver.set_page_load_timeout(20)
print("Successful open browser...")


tickerList = ["700", "3690", "1"]
money18Link = "https://money18.on.cc/eng/info/hk/liveinfo_quote_00700.html"
print(f"money18Link:\n{money18Link}")
driver.get(money18Link)
driver.find_element(By.XPATH, '//div[@class="industryClose"]/img').click() # remove the AD from the main window

for ticker in tickerList:
    driver.get(money18Link)
    inputElem = driver.find_element(By.CSS_SELECTOR, '[class="stockInput stock-number"]')
    inputElem = myElem = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, '[class="stockInput stock-number"]')))
    inputElem.click()
    inputElem.clear()
    inputElem.send_keys(ticker)
    inputElem.send_keys(Keys.ENTER)

    print(f"ticker: {ticker} is running...")
    driver.switch_to.default_content()
    curPx = driver.find_element(By.XPATH, '//div[@class="section-body"]/div[4]/div[1]/div/div/div[1]/span[@class="value"]').text
    print(f"curPx: {curPx}")

The results in the scenarios below look odd: the scraped stock price does not correspond to its code:

Scen 1:
ticker: 700 is running...
curPx: 374.400
ticker: 3690 is running...
curPx: 374.400
ticker: 1 is running...
curPx: 374.400

Scen 2:
ticker: 700 is running...
curPx: 374.400
ticker: 3690 is running...
curPx: 374.400
ticker: 1 is running...
curPx: 155.200

You don't need Selenium here, because the site exposes a JSON API.

Take a look at this example:

import time

import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36 OPR/83.0.4254.62',
}

api_url = 'https://realtime-money18-cdn.on.cc/securityQuote/genStockDetailHKJSON.php'

stockcodes = ['700', '3690', '1']

while True:
    for code in stockcodes:
        params = {'stockcode': code}

        r = requests.get(api_url, headers=headers, params=params)
        data = r.json()
        price = data['real']['ltp']
        print(f'Price for code {code} = {price}')

    print()
    time.sleep(5)

This prints:

Price for code 700 = 374.400
Price for code 3690 = 155.200
Price for code 1 = 56.750
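
If the price is needed as a number rather than a string, the lookup can be wrapped in a small helper. This is only a sketch: it assumes the payload keeps the `real` / `ltp` layout used above, and `extract_last_price` is a hypothetical name, not part of any library.

```python
from typing import Any, Mapping


def extract_last_price(payload: Mapping[str, Any]) -> float:
    """Return the last traded price from a decoded quote payload.

    Assumes the layout used in the example above: payload['real']['ltp']
    holds the price as a string such as '374.400'.
    """
    try:
        return float(payload["real"]["ltp"])
    except (KeyError, TypeError, ValueError) as exc:
        # Missing keys, a null value, or non-numeric text all end up here.
        raise ValueError(f"unexpected quote payload: {payload!r}") from exc


# Hand-written sample in the assumed shape (not a real API response).
sample = {"real": {"ltp": "374.400"}}
print(extract_last_price(sample))
```

In the loop above, `price = extract_last_price(r.json())` would replace the direct dictionary access, so a changed or empty response raises a clear error instead of a bare `KeyError`.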

To print the stock number and its value, you can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    driver.get("https://money18.on.cc/eng/info/hk/liveinfo_quote_00700.html")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.industryClose>img"))).click()
    stock_number = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "input.stockInput.stock-number"))).get_attribute("value")
    stock_value = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.text-right.stock-price span.value"))).text
    print(f"{stock_number} is {stock_value}")
    
  • Using XPATH:

    driver.get("https://money18.on.cc/eng/info/hk/liveinfo_quote_00700.html")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='industryClose']/img"))).click()
    stock_number = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//input[@class='stockInput stock-number']"))).get_attribute("value")
    stock_value = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//div[@class='text-right stock-price']//span[@class='value']"))).text
    print(f"{stock_number} is {stock_value}")
    
  • Console output:

    00700 is 374.400
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
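
For several tickers, as in the question, one possible approach is to open each ticker's own page and wait until the page shows that code before reading the price, so that a stale value from the previous ticker is not picked up. The sketch below rests on two unverified assumptions: that every ticker's page follows the `liveinfo_quote_00700.html` pattern with a five-digit zero-padded code, and that the stock-number input holds the code of the quote currently displayed. The helper names `pad_code`, `quote_url` and `read_price` are made up for illustration.

```python
BASE_URL = "https://money18.on.cc/eng/info/hk/liveinfo_quote_{code}.html"


def pad_code(ticker: str) -> str:
    """Zero-pad a ticker to five digits, e.g. '700' -> '00700'."""
    return ticker.zfill(5)


def quote_url(ticker: str) -> str:
    """Build the quote page URL (assumed pattern, taken from the 00700 page)."""
    return BASE_URL.format(code=pad_code(ticker))


def read_price(driver, ticker: str, timeout: float = 20) -> str:
    """Open the ticker's page and return the displayed price text."""
    # Imported here so the URL helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    code = pad_code(ticker)
    driver.get(quote_url(ticker))
    wait = WebDriverWait(driver, timeout)

    # Block until the stock-number input shows the requested code.
    wait.until(
        lambda d: d.find_element(
            By.CSS_SELECTOR, "input.stockInput.stock-number"
        ).get_attribute("value") == code
    )
    # Return the price text once it is non-empty.
    return wait.until(
        lambda d: d.find_element(
            By.CSS_SELECTOR, "div.text-right.stock-price span.value"
        ).text
    )
```

With an existing `driver`, this could be called as `for t in ["700", "3690", "1"]: print(t, read_price(driver, t))`. The advertisement overlay may still need to be closed once before the first read, as in the snippets above.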