Google Finance Stock Screener - Python (Scrapy)
I am trying to use Scrapy to get stock prices from Google Finance. The code does not show any errors, but the output file is blank.
The code is pasted below:
import scrapy

bse_list=['quote/ABB:NSE','quote/AEGISLOG:NSE','quote/AMARAJABAT:NSE','quote/AMBALALSA:NSE','quote/HDFC:NSE','quote/ANDHRAPET:NSE','quote/ANSALAPI:NSE']

class CrawlSpider(scrapy.Spider):
    name = 'crawl'
    allowed_domains = ['www.google.com/finance/']
    start_urls = ['https://google.com/finance/']

    def parse(self, response):
        for stock in bse_list:
            url_new = response.urljoin(stock)
            yield scrapy.Request(url_new, callback = self.parse_book)

    def parse_book(self, response):
        stock_name = response.xpath('//*[@class="zzDege"]/text()').extract_first()
        current_price = response.xpath('//*[@class="YMlKec fxKbKc"]/text()').extract_first()
        stock_info = response.xpath('//*[@class="P6K39c"]/text()').extract()
        last_closing_price = stock_info[0]
        day_range = stock_info[1]
        year_range = stock_info[2]
        market_cap = stock_info[3]
        p_e_ratio = stock_inf[4]
        yield {
            "stock_name": stock_name,
            "current_price": current_price,
            "last_closing_price": last_closing_price,
            "day_range": day_range,
            "year_range": year_range,
            "market_cap": market_cap,
            "p_e_ratio": p_e_ratio
        }
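The problem is in the stock info selection; the rest of the code works fine. With those lines commented out, the spider yields items: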
import scrapy

bse_list = ['quote/ABB:NSE', 'quote/AEGISLOG:NSE', 'quote/AMARAJABAT:NSE',
            'quote/AMBALALSA:NSE', 'quote/HDFC:NSE', 'quote/ANDHRAPET:NSE', 'quote/ANSALAPI:NSE']

class CrlSpider(scrapy.Spider):
    name = 'crl'
    start_urls = ['https://google.com/finance/']

    def parse(self, response):
        for stock in bse_list:
            url_new = response.urljoin(stock)
            yield scrapy.Request(url_new, callback=self.parse_book)

    def parse_book(self, response):
        stock_name = response.xpath('//*[@class="zzDege"]/text()').extract_first()
        current_price = response.xpath('//*[@class="YMlKec fxKbKc"]/text()').extract_first()
        # stock_info = response.xpath('//*[@class="P6K39c"]/text()').extract()
        # last_closing_price = stock_info[0]
        # day_range = stock_info[1]
        # year_range = stock_info[2]
        # market_cap = stock_info[3]
        # p_e_ratio = stock_inf[4]
        yield {
            "stock_name": stock_name,
            "current_price": current_price,
            # "last_closing_price": last_closing_price,
            # "day_range": day_range,
            # "year_range": year_range,
            # "market_cap": market_cap,
            # "p_e_ratio": p_e_ratio
        }
Output:
{'stock_name': 'Ansal Properties and Infrastructure Ltd', 'current_price': '₹13.30'}
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/ANDHRAPET:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AMBALALSA:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AEGISLOG:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/ABB:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/HDFC:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AMARAJABAT:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/ANDHRAPET:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AMBALALSA:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AEGISLOG:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/ABB:NSE>
{'stock_name': 'ABB India Ltd', 'current_price': '₹2,139.00'}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/HDFC:NSE>
{'stock_name': 'Housing Development Finance Corp Ltd', 'current_price': '₹2,994.15'}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AMARAJABAT:NSE>
{'stock_name': 'Amara Raja Batteries Ltd', 'current_price': '₹685.40'}
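If you want the stock info fields back, one option is to stop indexing the extracted list directly, since `stock_info[0]` through `stock_info[4]` raise an `IndexError` whenever a page returns fewer than five matches. The following is only a minimal sketch, not a tested fix for Google Finance: the helper name `map_stock_info` and the constant `STOCK_INFO_FIELDS` are made up for illustration, and it assumes the values still arrive in the order used in the question.

```python
from typing import Dict, List, Optional

# Field names in the order the question's spider reads them from stock_info.
# The ordering is an assumption carried over from the question, not verified.
STOCK_INFO_FIELDS = [
    "last_closing_price",
    "day_range",
    "year_range",
    "market_cap",
    "p_e_ratio",
]


def map_stock_info(values: List[str]) -> Dict[str, Optional[str]]:
    """Pair extracted strings with field names; missing positions become None."""
    # Pad short lists with None so no index lookup can fail.
    padded = list(values) + [None] * (len(STOCK_INFO_FIELDS) - len(values))
    # zip stops at the shorter input, so extra values are ignored.
    return dict(zip(STOCK_INFO_FIELDS, padded))
```

Inside `parse_book` you could then build the item from `stock_name` and `current_price` and call `item.update(map_stock_info(stock_info))`. Note that this only guards against short lists; if a page omits a field in the middle, the remaining values would still be shifted to the wrong names.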