Python code to scrape ticker symbols from Yahoo Finance
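I have a list of more than 1000 companies that I could invest in, and I need the ticker symbol for each of them. I'm running into trouble in two places: stripping the output of the soup, and looping through all the company names.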
Please see an example of the site: https://finance.yahoo.com/lookup?s=asml. My idea is to replace asml with each company name, i.e. 'https://finance.yahoo.com/lookup?s=' + Companies, so that I can loop through all the companies.
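The URL idea above can be sketched as follows. Company names contain spaces, so each name probably needs URL-encoding before it is appended, and the URL has to be built one name at a time rather than from the whole DataFrame. `build_lookup_url` is a hypothetical helper name, and the sketch only builds the URLs without requesting them.

```python
from urllib.parse import quote_plus

LOOKUP_URL = "https://finance.yahoo.com/lookup?s="  # base URL from the example above


def build_lookup_url(company_name):
    """Return the lookup URL for one company name (hypothetical helper)."""
    # quote_plus encodes spaces as '+' and escapes characters such as '&'
    return LOOKUP_URL + quote_plus(company_name)


company_names = ["Abbott Laboratories", "ABBVIE", "Accenture Plc"]
urls = [build_lookup_url(name) for name in company_names]
```

With a DataFrame, the same helper could be applied to the 'Company name' column one row at a time.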
companies=df
   Company name
0  Abbott Laboratories
1  ABBVIE
2  Abercrombie
3  Abiomed
4  Accenture Plc
This is the code I have so far. The strip step doesn't work, and neither does the loop over all the companies.
#Create a function to scrape the data
def scrape_stock_symbols():
    Companies=df
    url= 'https://finance.yahoo.com/lookup?s='+ Companies
    page= requests.get(url)
    soup = BeautifulSoup(page.text, "html.parser")
    Company_Symbol=Soup.find_all('td',attrs ={'class':'data-col0 Ta(start) Pstart(6px) Pend(15px)'})
    for i in company_symbol:
        try:
            row = i.find_all('td')
            company_symbol.append(row[0].text.strip())
        except Exception:
            if company not in company_symbol:
                next(Company)
    return (company_symbol)

#Loop through every company in companies to get all of the tickers from the website
for Company in companies:
    try:
        (temp_company_symbol) = scrape_stock_symbols(company)
    except Exception:
        if company not in companies:
            next(Company)
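A minimal sketch of the loop structure alone, with the network call left out. A `for` loop moves to the next item on its own, so no `next()` call is needed, and an exception raised for one company can be caught so the loop carries on. `collect_symbols` and `fake_lookup` are hypothetical names; `fake_lookup` stands in for whatever function actually fetches a ticker.

```python
def collect_symbols(company_names, lookup):
    """Call lookup(name) for every name; record None when a lookup fails."""
    symbols = {}
    for name in company_names:
        try:
            symbols[name] = lookup(name)
        except Exception:
            # One failed company should not stop the loop
            symbols[name] = None
    return symbols


def fake_lookup(name):
    """Stand-in for a real lookup; the mapping below is sample data only."""
    sample = {"Abbott Laboratories": "ABT", "ABBVIE": "ABBV"}
    return sample[name]  # raises KeyError for unknown names


result = collect_symbols(["Abbott Laboratories", "ABBVIE", "Unknown Co"], fake_lookup)
```

Passing the company name in as an argument also avoids the mismatch in the code above, where the function is defined without parameters but called with one.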
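Another difficulty is that a symbol lookup on Yahoo Finance returns many company names for a single query, so I will have to clean the data afterwards. I want to set the AMS exchange as the default: if a company is listed on multiple exchanges, I'm only interested in the AMS ticker. The end goal is to create a new DataFrame: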
   Company name         Company_symbol
0  Abbott Laboratories  ABT
1  ABBVIE               ABBV
2  Abercrombie          ANF
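Here's a solution that doesn't require any scraping. It uses a package called yahooquery (disclaimer: I'm the author), which uses an API endpoint that returns symbols for a user's query. You can do something like this: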
import pandas as pd
import yahooquery as yq


def get_symbol(query, preferred_exchange='AMS'):
    try:
        data = yq.search(query)
    except ValueError: # Will catch JSONDecodeError
        print(query)
    else:
        quotes = data['quotes']
        if len(quotes) == 0:
            return 'No Symbol Found'

        # Default to the first result, then prefer a quote on the preferred exchange
        symbol = quotes[0]['symbol']
        for quote in quotes:
            if quote['exchange'] == preferred_exchange:
                symbol = quote['symbol']
                break
        return symbol


companies = ['Abbott Laboratories', 'ABBVIE', 'Abercrombie', 'Abiomed', 'Accenture Plc']
df = pd.DataFrame({'Company name': companies})
df['Company symbol'] = df.apply(lambda x: get_symbol(x['Company name']), axis=1)
   Company name         Company symbol
0  Abbott Laboratories  ABT
1  ABBVIE               ABBV
2  Abercrombie          ANF
3  Abiomed              ABMD
4  Accenture Plc        ACN
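The exchange preference in `get_symbol` can be checked without any network access by pulling the selection step into a pure function. This is only a sketch and not part of yahooquery: `pick_symbol` is a hypothetical name, the sample quotes are made up, and only the `symbol` and `exchange` keys used above are assumed.

```python
def pick_symbol(quotes, preferred_exchange="AMS"):
    """Return the symbol on the preferred exchange, else the first symbol, else None."""
    if not quotes:
        return None
    for quote in quotes:
        # .get() tolerates quotes that lack an 'exchange' key
        if quote.get("exchange") == preferred_exchange:
            return quote["symbol"]
    return quotes[0]["symbol"]


# Made-up sample data shaped like the 'quotes' list used above
sample_quotes = [
    {"symbol": "XYZ", "exchange": "NMS"},
    {"symbol": "XYZ.AS", "exchange": "AMS"},
]
```

Returning `None` instead of the string 'No Symbol Found' is a design choice made for this sketch; it keeps missing tickers easy to filter out of a DataFrame later with `isna()`.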