Python web scraping: failed to extract the list of names from a website
I can't extract the first column, "Name", from the website. Can anyone help? The URL is: https://www.dianashippinginc.com/the-fleet/
'''
from selenium import webdriver
from selenium.webdriver.common.by import By

chromedriver_location = ""  # path to the chromedriver executable
driver = webdriver.Chrome(chromedriver_location)
driver.get('https://www.dianashippinginc.com/fleet-employment-table/')
cookie_address = '//*[@id="ccc-notify-accept"]/span'
name_address = '/html/body/div[3]/div/div/div[2]/table/tbody/tr[3]/td[2]/span'
# Dismiss the cookie banner, then look up a single name cell.
driver.find_element(By.XPATH, cookie_address).click()
driver.find_element(By.XPATH, name_address)
'''
'''
import scrapy


class MySpider(scrapy.Spider):
    name = 'myspider'
    start_urls = ['https://www.dianashippinginc.com/the-fleet/']

    def parse(self, response):
        # Collect the text of every cell in the name column.
        names = response.xpath('//div[@class="fleet-vessels__table_cell--norm-btn"]/text()').getall()
        # Process the names list to be as you want (remove tab characters, ranking numbers etc.)
        # A Scrapy callback must yield items (e.g. dicts) or requests, not a bare list.
        yield {'names': names}
'''
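The clean-up step mentioned in the comment above could look roughly like this. It is only a sketch: the helper name `clean_names` is made up, the sample strings are invented rather than taken from the site, and it assumes each raw text node may carry tabs, newlines, and a leading ranking number such as `1`, `2.` or `3)`.

```python
import re


def clean_names(raw_names):
    """Strip whitespace and an assumed leading ranking number from scraped text."""
    cleaned = []
    for raw in raw_names:
        # Collapse tabs, newlines and repeated spaces into single spaces.
        text = " ".join(raw.split())
        # Drop a leading ranking number like "12", "12." or "12)" (assumed format).
        text = re.sub(r"^\d+\s*[.)]?\s*", "", text)
        # Skip nodes that were whitespace only.
        if text:
            cleaned.append(text)
    return cleaned


# Invented sample input, shaped like what .getall() might return.
sample = ["\n\t\t1\tALPHA\n", "\t2. BETA STAR\t", "\n\t\n", "3) GAMMA"]
print(clean_names(sample))
```

Note that the regular expression would also strip digits from a name that genuinely starts with a number, so adjust the pattern if the real data has such names.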