Trying to loop through profile lists using Selenium
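I am trying to loop through all the profiles and store each person's name, job profile, and location in lists. This is a screenshot of the LinkedIn page I am on:
This is the li HTML tag I have to loop over: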
<li class="reusable-search__result-container ">
<div class="entity-result ">
<div class="entity-result__item">
<div class="entity-result__image">
<div class="display-flex align-items-center">
<a class="app-aware-link" aria-hidden="true" href="https://www.linkedin.com/search/results/people/headless?geoUrn=%5B103644278%5D&origin=FACETED_SEARCH&keywords=python%20developer">
<div id="ember522" class="ivm-image-view-model ember-view"> <div class="
ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex
">
<div class="EntityPhoto-circle-3-ghost-person ivm-view-attr__ghost-entity ">
<!----> </div>
</div>
</div>
</a>
</div>
</div>
<div class="entity-result__content entity-result__divider pt3 pb3 t-12 t-black--light">
<div class="mb1">
<div class="linked-area flex-1 cursor-pointer">
<div class="t-roman t-sans">
<span class="entity-result__title">
<div class="display-flex">
<span class="entity-result__title-line flex-shrink-1 entity-result__title-text--black ">
<span class="entity-result__title-text t-16">
<a class="app-aware-link" href="https://www.linkedin.com/search/results/people/headless?geoUrn=%5B103644278%5D&origin=FACETED_SEARCH&keywords=python%20developer">
<!---->LinkedIn Member<!---->
</a>
<!----> </span>
</span>
<!----></div>
</span>
</div>
<div>
<div class="entity-result__primary-subtitle t-14 t-black">
<!---->Software Developer<!---->
</div>
<div class="entity-result__secondary-subtitle t-14">
<!---->United States<!---->
</div>
</div>
</div>
</div>
<div class="linked-area flex-1 cursor-pointer">
<p class="entity-result__summary entity-result__summary--2-lines t-12 t-black--light ">
<!---->Current: Full Stack Software<span class="white-space-pre"> </span><strong><!---->Developer<!----></strong><span class="white-space-pre"> </span>at GE Healthcare<!---->
</p>
</div>
<!----> </div>
<div class="entity-result__actions entity-result__divider entity-result__actions--empty">
<!----> <!---->
</div>
</div>
</div>
</li>
Currently, I can get the profile names with the following code:
profile_names = []
linkedin_members = browser.find_elements_by_xpath('//span[@class="entity-result__title"]')
for linkedin_member in linkedin_members:
    name = linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('text').strip()
    profile_names.append(name)
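But I cannot get the job location and the job profile. Can anyone guide me on how to write the code for that?
I tried something like this, but I get an error: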
profile_names = []
job_profiles = []
linkedin_members = browser.find_elements_by_xpath('//div[@class="linked-area flex-1 cursor-pointer"]')
for linkedin_member in linkedin_members:
    name = linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('text').strip()
    job_profile = linkedin_member.find_element_by_xpath('.//div[@class="entity-result__primary-subtitle"]').text
    profile_names.append(name)
    job_profiles.append(job_profiles)
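You just need to identify those elements (I think you can do that with class names and CSS selectors), then loop over the elements and append their text to the appropriate lists.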
profile_names = []
linkedin_members = browser.find_elements_by_xpath('//span[@class="entity-result__title"]')
for linkedin_member in linkedin_members:
    name = linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('text').strip()
    profile_names.append(name)
user_positions = []
positions = browser.find_elements_by_css_selector('div.entity-result__primary-subtitle')
for position in positions:
    user_positions.append(position.text.strip())
user_locations = []
locations = browser.find_elements_by_css_selector('div.entity-result__secondary-subtitle')
for location in locations:
    user_locations.append(location.text.strip())
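A possible reason the attempt in the question raises an error: the XPath test `@class="entity-result__primary-subtitle"` compares the whole attribute value, while in the posted HTML that element's class attribute is `entity-result__primary-subtitle t-14 t-black`. A CSS class selector such as `div.entity-result__primary-subtitle` matches a single class token instead. The sketch below illustrates the difference without a browser, using Python's standard `xml.etree.ElementTree` on a trimmed copy of the markup. The `has_class` helper is hypothetical and only mimics token matching; it is not part of Selenium.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the relevant part of the posted markup.
SNIPPET = """
<li class="reusable-search__result-container ">
  <div class="entity-result__primary-subtitle t-14 t-black">
    Software Developer
  </div>
  <div class="entity-result__secondary-subtitle t-14">
    United States
  </div>
</li>
"""


def has_class(element, token):
    """Hypothetical helper: True if `token` is one of the element's classes."""
    return token in element.get("class", "").split()


root = ET.fromstring(SNIPPET)

# Exact comparison of the whole attribute value, as in the failing XPath.
exact = root.findall(".//div[@class='entity-result__primary-subtitle']")

# Token-based comparison, which is how a CSS class selector behaves.
by_token = [d for d in root.iter("div")
            if has_class(d, "entity-result__primary-subtitle")]

print(len(exact))                # 0
print(by_token[0].text.strip())  # Software Developer
```

In XPath, the usual way to get a similar effect is `contains(@class, "entity-result__primary-subtitle")`, which is a substring test on the attribute value.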
Another approach is:
# Locators are defined up front; each one is later applied relative to a single result item.
members_search_results_xpath = '//div[@class="entity-result__item"]'
# Assumes named profiles render a span with a dir attribute; the anonymous
# "LinkedIn Member" entry in the posted HTML has no such span.
member_name_xpath = '//span[contains(@class,"entity-result__title-text")]//span[@dir]'
member_location_xpath = '//div[contains(@class,"entity-result__secondary-subtitle")]'
# No entity-result__item prefix here: the search already starts inside that div.
member_job_title_xpath = '//div[contains(@class,"entity-result__primary-subtitle")]'
profile_names = []
profile_addresses = []
profile_job_titles = []
linkedin_members = browser.find_elements_by_xpath(members_search_results_xpath)
for linkedin_member in linkedin_members:
    # The leading '.' scopes each search to the current result item.
    # .text is used because the 'text' property belongs to elements such as <a>,
    # so get_attribute('text') is not reliable on span and div elements.
    name = linkedin_member.find_element_by_xpath('.' + member_name_xpath).text.strip()
    profile_names.append(name)
    address = linkedin_member.find_element_by_xpath('.' + member_location_xpath).text.strip()
    profile_addresses.append(address)
    job_title = linkedin_member.find_element_by_xpath('.' + member_job_title_xpath).text.strip()
    profile_job_titles.append(job_title)
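Here I defined the locators as parameters outside the code that uses them.
One of the best practices is not to hard-code locators inside the methods that use them.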
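One way to take that practice a step further is to group the locators in a single object and pass it to the function that reads the page. The following is a minimal sketch, not a definitive implementation: `SearchLocators`, `first_text`, and `collect_profiles` are hypothetical names, and the CSS selectors are assumptions derived from the HTML posted in the question. It relies on the generic `find_elements(by, value)` method that Selenium drivers and elements provide, and it uses the plain strategy string `"css selector"`, which is the value of Selenium's `By.CSS_SELECTOR`. Reading all three fields from the same result card keeps them aligned, and a missing field becomes `None` instead of raising an exception.

```python
from dataclasses import dataclass

# Strategy string; same value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS = "css selector"


@dataclass(frozen=True)
class SearchLocators:
    """Hypothetical container that keeps all selectors in one place."""
    card: tuple = (CSS, "li.reusable-search__result-container")
    name: tuple = (CSS, "span.entity-result__title-text a.app-aware-link")
    job_title: tuple = (CSS, "div.entity-result__primary-subtitle")
    location: tuple = (CSS, "div.entity-result__secondary-subtitle")


def first_text(parent, locator):
    """Return the stripped text of the first match under `parent`, or None."""
    matches = parent.find_elements(*locator)
    return matches[0].text.strip() if matches else None


def collect_profiles(driver, locators=SearchLocators()):
    """Read name, job title and location from each result card."""
    profiles = []
    for card in driver.find_elements(*locators.card):
        profiles.append({
            "name": first_text(card, locators.name),
            "job_title": first_text(card, locators.job_title),
            "location": first_text(card, locators.location),
        })
    return profiles
```

With a live session this would be called as `collect_profiles(browser)`, where `browser` is the WebDriver instance from the question. If the page markup changes, only the `SearchLocators` defaults need updating.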