Iterate over a list of items, extract each item's data from the web browser, and append the results into a data frame as the final output
I am trying to extract stock-market data from a web browser. I can open the browser and extract the data for a single stock.
Below is the Python code for one stock: it opens the web browser with Selenium WebDriver and extracts the data from the web page with BeautifulSoup.
This is very basic code. It needs to be simplified, and extended so that it can extract the data for a list of stocks like this one:
stock_list = ['Infosys', 'Reliance industries', 'wipro']
I am not sure how to extract the data for the multiple items in the list, and how to simplify the code around that.
Python code to extract the data for one stock:
import requests
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC

headers = {'User-Agent': 'Mozilla/5.0'}

# Open the stock search page and search for one stock
browser = webdriver.Firefox()
browser.get("https://www.tickertape.in/stocks/")
browser.maximize_window()
inputElement = browser.find_element_by_id('search-stock-input')
inputElement.click()
inputElement.send_keys('Infosys')
inputElement.click()
inputElement = wait(browser, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#search-stock-input")))
inputElement.click()
inputElement.send_keys(Keys.RETURN)

# Fetch the resulting stock page and parse it
page = requests.get(browser.current_url, headers=headers)
soup = BeautifulSoup(page.text, 'html.parser')

ScriptName = []
ScriptName_elem = soup.find_all(class_='jsx-2256451 security-name')
for item in ScriptName_elem:
    ScriptName.append(item.text)

intrinsic_value = []
intrinsic_value_elem = soup.find_all(class_='jsx-3277407410 jsx-1058798148 lh-138 text-13 commentary-desc')
for item in intrinsic_value_elem:
    intrinsic_value.append(item.text)

Returns_vs_FD_rates = []
Returns_vs_FD_rates_elem = soup.find_all(class_='jsx-3947392323 jsx-1058798148 lh-138 text-13 commentary-desc')
for item in Returns_vs_FD_rates_elem:
    Returns_vs_FD_rates.append(item.text)

Divident_Returns = []
Divident_Returns_elem = soup.find_all(class_='jsx-566496888 jsx-1058798148 lh-138 text-13 commentary-desc')
for item in Divident_Returns_elem:
    Divident_Returns.append(item.text)

Entry_Point = []
Entry_Point_elem = soup.find_all(class_='jsx-3697483086 jsx-1058798148 lh-138 text-13 commentary-desc')
for item in Entry_Point_elem:
    Entry_Point.append(item.text)

Red_Flag_Indicator = []
Red_Flag_Indicator_elem = soup.find_all(class_='jsx-1920835126 jsx-1058798148 relative no-select tooltip-holder')
for item in Red_Flag_Indicator_elem:
    Red_Flag_Indicator.append(item.text)

Red_Flag_Indicator_Reason = []
Red_Flag_Indicator_Reason_elem = soup.find_all(class_='jsx-1920835126 jsx-1058798148 lh-138 text-13 commentary-desc')
for item in Red_Flag_Indicator_Reason_elem:
    Red_Flag_Indicator_Reason.append(item.text)

# Combine the parallel lists into one row per security
df_array = []
for ScriptName_n, intrinsic_value_n, Returns_vs_FD_rates_n, Divident_Returns_n, Entry_Point_n, Red_Flag_Indicator_n, Red_Flag_Indicator_Reason_n in zip(
        ScriptName, intrinsic_value, Returns_vs_FD_rates, Divident_Returns, Entry_Point, Red_Flag_Indicator, Red_Flag_Indicator_Reason):
    df_array.append({'ScriptName': ScriptName_n, 'intrinsic_value': intrinsic_value_n,
                     'Returns_vs_FD_rates': Returns_vs_FD_rates_n, 'Divident_Returns': Divident_Returns_n,
                     'Entry_Point': Entry_Point_n, 'Red_Flag_Indicator': Red_Flag_Indicator_n,
                     'Red_Flag_Indicator_Reason': Red_Flag_Indicator_Reason_n})
df = pd.DataFrame(df_array)
df
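One direction I imagine is wrapping the per-stock steps above in a function and looping over `stock_list`, collecting one data frame per stock and concatenating them at the end. A rough sketch of that pattern (here `scrape_stock` is a placeholder; the real body would be the Selenium/BeautifulSoup steps above):

```python
import pandas as pd

stock_list = ['Infosys', 'Reliance industries', 'wipro']

def scrape_stock(stock_name):
    # Placeholder: the real implementation would run the Selenium search
    # and BeautifulSoup extraction shown above for `stock_name`.
    return pd.DataFrame([{'ScriptName': stock_name, 'intrinsic_value': 'n/a'}])

frames = [scrape_stock(name) for name in stock_list]  # one frame per stock
df = pd.concat(frames, ignore_index=True)             # final combined output
print(df)
```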
Thanks in advance.
You can call the same APIs the page does. The first API retrieves the id and security name for a stock; the id is then used by the second API, which returns those checklist items.
If you create a list of dictionaries, one per ticker, you can convert it to a DataFrame at the end. Let me know if I have missed an item. I also chose to store a lot of other data, e.g. low, high, etc., in another dictionary called other_data.
import requests
import pandas as pd

other_data = {}
results = []
stock_list = ['Infosys', 'Reliance industries', 'wipro']

with requests.Session() as s:
    for ticker in stock_list:
        try:
            # First API: look up the stock id (sid) and security name
            r = s.get(
                f'https://api.tickertape.in/search?text={ticker.lower()}&types=stock,brands,index,etf,mutualfund').json()
            stock_id = r['data']['stocks'][0]['sid']
            name = r['data']['stocks'][0]['name']
            other_data[stock_id] = r

            # Second API: fetch the investment checklist items for that id
            r = s.get(
                f'https://api.tickertape.in/stocks/investmentChecklists/{stock_id}?type=basic').json()
            d = {i['title']: i['description'] for i in r['data']}

            # Merge name, quote data, market cap and checklist into one row
            d = {**{'Security': name}, **other_data[stock_id]['data']['stocks'][0]['quote'],
                 **{'marketCap': other_data[stock_id]['data']['stocks'][0]['marketCap']}, **d}
            results.append(d)
        except Exception as e:
            print(ticker, e)

df = pd.DataFrame(results)
df
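The row-building step relies on dict unpacking: `{**a, **b}` merges dictionaries left to right, with later keys overwriting earlier ones. A small standalone illustration of the same pattern, using made-up values rather than the real API response:

```python
name = 'Infosys'
quote = {'low': 1400.0, 'high': 1450.2, 'price': 1432.9}          # stands in for the API's quote dict
checklist = {'Intrinsic Value': 'desc...', 'Entry Point': 'desc...'}  # stands in for the checklist dict

# Later dicts win on key collisions; here all keys are distinct,
# so the row simply gains every key in left-to-right order.
row = {**{'Security': name}, **quote, **{'marketCap': 595000.0}, **checklist}
print(row)
```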