Scraping issues in LinkedIn Sales Navigator
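I am trying to scrape details of some companies and their leads from LinkedIn Sales Navigator. To log in, I created a text file named config.txt that contains the username and password. The problem is that the login succeeds, but I am then shown another login page.
For example, if I log in through https://www.linkedin.com/checkpoint/rm/sign-in-another-account, it logs in successfully but then immediately gives me another login page: https://www.linkedin.com/sales/login
If I repeat the process on that second URL, it should ideally give me the Sales Navigator home page, but it gives me the same page again, i.e. https://www.linkedin.com/sales/login
Here is my code: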
import time

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager


def linkedin_scraper():
    print("Started Successfully.")
    browser = webdriver.Chrome(ChromeDriverManager().install())
    browser.get('https://www.linkedin.com/checkpoint/rm/sign-in-another-account')
    file = open('config.txt')
    lines = file.readlines()
    username = lines[0]
    password = lines[1]
    time.sleep(1)
    usernameID = browser.find_element_by_id('username')
    usernameID.send_keys(username)
    time.sleep(1)
    passwordID = browser.find_element_by_id('password')
    passwordID.send_keys(password)
    time.sleep(1)
    browser.get('https://www.linkedin.com/sales/search/company?geoIncluded=102713980&industryIncluded=106%2C45&jobOpportunities=JO1')
    time.sleep(1)
    # maximizing window
    browser.maximize_window()
    # rest of code
Where exactly is this failing? I can't work out why this happens. Please let me know.
Thanks in advance.
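LinkedIn login code
As Ben said in the comments, LinkedIn uses bot detection, which is why you cannot log in. To get around it, you need to pass some additional Chrome options.
The following code snippet should solve your problem: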
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

email = ""
password = ""

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--start-maximized")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-notifications")
# Both switches go in one list: a second call with the same key would overwrite the first.
chrome_options.add_experimental_option("excludeSwitches", ["enable-logging", "enable-automation"])
chrome_options.add_experimental_option("useAutomationExtension", False)

# Selenium 4 syntax: the driver path is passed through a Service object.
driver = webdriver.Chrome(service=Service("chromedriver.exe"), options=chrome_options)
driver.get("https://www.linkedin.com/")
# Wait until the email field is visible before typing into it.
WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#session_key")))
driver.find_element(By.CSS_SELECTOR, "#session_key").send_keys(email)
driver.find_element(By.CSS_SELECTOR, "#session_password").send_keys(password)
driver.find_element(By.CSS_SELECTOR, "body > main > section.section.section--hero > div.sign-in-form-container > form > button").click()
# The global navigation bar only appears once the login has gone through.
WebDriverWait(driver, 100).until(EC.presence_of_element_located((By.ID, "global-nav")))
print("Login Successful.")
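The snippet above hardcodes the email and password. If you would rather keep them in config.txt as in the question, here is a minimal sketch of two helpers. The names load_credentials and is_login_page are hypothetical, not part of Selenium, and the URL check only assumes that login pages look like the two URLs quoted in the question. Note that readlines() keeps the trailing newline on each line, and a newline passed to send_keys() is typically treated as pressing Enter, so the form may be submitted before the password is typed.

```python
from pathlib import Path
from urllib.parse import urlparse


def load_credentials(path="config.txt"):
    """Return (username, password) from the first two lines of the file."""
    # splitlines() drops the line endings that readlines() would keep.
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    if len(lines) < 2:
        raise ValueError(f"{path} must contain a username line and a password line")
    # strip() also removes stray spaces around each value.
    return lines[0].strip(), lines[1].strip()


def is_login_page(url):
    """Hypothetical check: True if the URL path looks like a LinkedIn login page.

    Assumption: login pages end in /login or live under /checkpoint,
    as in the two URLs quoted in the question.
    """
    path = urlparse(url).path.rstrip("/")
    return path.endswith("/login") or path.startswith("/checkpoint")
```

You could then call email, password = load_credentials() before filling in the form, and after opening the Sales Navigator search URL check is_login_page(driver.current_url) to see whether you were bounced back to a login page instead of assuming the session carried over.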