Selenium/Python: click to open multiple rows
I'm trying to scrape data from a bunch of rows. I can expand a single row with:
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="7858101"]'))).click()
The problem is that every row has a different ID. The rows do share the same class names, so I also tried:
WebDriverWait(driver, 60).until(EC.presence_of_elements_located((By.CLASS_NAME, 'course-row normal faculty-BU active'))).click()
I've attached a couple of the rows below. Any suggestions on how to solve this?
<tr id="7858101" class="course-row normal faculty-BU active" data-cid="7858101" data-cc="ACTG1P01" data-year="2021" data-session="FW" data-type="UG" data-subtype="UG" data-level="Year1" data-fn2_notes="BB" data-duration="2" data-class_type="ASY" data-course_section="1" data-days=" " data-class_time="" data-room1="ASYNC" data-room2="" data-location="ASYNC" data-location_desc="" data-instructor="Zhang, Xia (Celine)" data-msg="0" data-main_flag="1" data-secondary_type="E" data-startdate="1631073600" data-enddate="1638853200" data-faculty_code="BU" data-faculty_desc="Goodman School of Business">
<td class="arrow"><span class="fa fa-angle-down"></span></td>
<td class="course-code">ACTG 1P01 </td>
<td class="title"><a href="#" data-cc="ACTG1P01" data-cid="7858101">Introduction to Financial Accounting</a> <div class="details-loader" style="display: none;"><span class="fa fa-refresh fa-spin fa-fw"></span></div></td>
<td class="duration">D2</td>
<td class="days"> </td>
<td class="time"> </td>
<!-- <td class="start" data-sort-value="1631073600">Sep 08, 2021</td> -->
<!-- <td class="end" data-sort-value="1638853200">Dec 07, 2021</td> -->
<td class="type">ASY</td>
<td class="data"><div style="" class="course-details-data">
<div class="description">
<h3>Introduction to Financial Accounting</h3>
<p class="page-intro">Fundamental concepts of financial accounting as related to the balance sheet, income statement and statement of cash flows. Understanding the accounting cycle and routine transactions. Integrates both theoretical and practical application of accounting concepts.</p>
<p><strong>Format:</strong> Lectures, discussion, 3 hours per week.</p>
<p><strong>Restrictions:</strong> open to BAcc majors.</p>
<p><strong>Exclusions:</strong> Completion of this course will replace previous assigned grade and credit obtained in ACTG 1P11, 1P91 and 2P51.</p>
<p><strong>Notes:</strong> Open to Bachelor of Accounting majors. </p>
</div>
<div class="vitals">
<ul>
<li><strong>Duration:</strong> Sep 08, 2021 to Dec 07, 2021</li>
<li>
<strong>Location:</strong> ASYNC </li>
<li><strong>Instructor:</strong> Zhang, Xia (Celine)</li>
<li><strong>Section:</strong> 1</li>
</ul>
</div>
<hr>
</div>
</td>
</tr>
<tr id="3724102" class="course-row normal faculty-BU active" data-cid="3724102" data-cc="ACTG1P01" data-year="2021" data-session="FW" data-type="UG" data-subtype="UG" data-level="Year1" data-fn2_notes="BB" data-duration="2" data-class_type="LEC" data-course_section="2" data-days=" M R " data-class_time="1100-1230" data-room1="GSB306" data-room2="" data-location="GSB306" data-location_desc="" data-instructor="Zhang, Xia (Celine)" data-msg="0" data-main_flag="1" data-secondary_type="E" data-startdate="1631073600" data-enddate="1638853200" data-faculty_code="BU" data-faculty_desc="Goodman School of Business">
<td class="arrow"><span class="fa fa-angle-right"></span></td>
<td class="course-code">ACTG 1P01 </td>
<td class="title"><a href="#" data-cc="ACTG1P01" data-cid="3724102">Introduction to Financial Accounting</a> <div class="details-loader"><span class="fa fa-refresh fa-spin fa-fw"></span></div></td>
<td class="duration">D2</td>
<td class="days">
<table class="coursecal">
<thead>
<tr>
<th class="">S</th>
<th class="active">M</th>
<th class="">T</th>
<th class="">W</th>
<th class="active">T</th>
<th class="">F</th>
<th class="">S</th>
</tr>
</thead>
<tbody>
<tr>
<td class="weekend "></td>
<td class="active"></td>
<td class=""></td>
<td class=""></td>
<td class="active"></td>
<td class=""></td>
<td class="weekend "></td>
</tr>
</tbody>
</table>
</td>
<td class="time">1100-1230</td>
<!-- <td class="start" data-sort-value="1631073600">Sep 08, 2021</td> -->
<!-- <td class="end" data-sort-value="1638853200">Dec 07, 2021</td> -->
<td class="type">LEC</td>
<td class="data"></td>
</tr>
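You're almost there...
You can use the driver.find_elements method to retrieve a list of all the matching web elements, then iterate over that list and click each element.
Because course-row normal faculty-BU active is actually several class names rather than a single class name, By.CLASS_NAME cannot match it; use an XPath or CSS selector there instead.
Note also that expected_conditions has no presence_of_elements_located; the plural condition is presence_of_all_elements_located, and it returns a list, which has no click() method.
It is also advisable to wait with the visibility_of_element_located expected condition rather than a presence condition. Presence is satisfied as soon as the element exists in the DOM, even if it has not yet been rendered on the page, whereas visibility_of_element_located waits until the element is actually displayed.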
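import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

row_xpath = '//tr[@class = "course-row normal faculty-BU active"]'
WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, row_xpath)))
time.sleep(0.4)  # short delay so the remaining rows can finish loading
elements = driver.find_elements(By.XPATH, row_xpath)  # find_elements (plural) returns a list
for element in elements:
    element.click()
    # scrape the data you need here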
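The point about compound class names can be illustrated with a small helper that turns the value of a class attribute into a CSS selector. This is only a sketch: css_for_classes is a hypothetical name, not part of Selenium, and it assumes the class names contain no characters that would need CSS escaping.

```python
def css_for_classes(tag: str, class_attr: str) -> str:
    """Build a CSS selector that requires every class in `class_attr`.

    Unlike an exact XPath @class comparison, the result still matches when
    the classes appear in a different order or alongside extra ones.
    """
    classes = class_attr.split()  # split() also collapses repeated whitespace
    return tag + "".join("." + name for name in classes)


selector = css_for_classes("tr", "course-row normal faculty-BU active")
print(selector)  # tr.course-row.normal.faculty-BU.active
```

The resulting string can be passed to driver.find_elements(By.CSS_SELECTOR, selector) in place of the XPath above.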
Because the id attribute of each <tr> has a dynamic value, to identify all the <tr> elements and click each of them you need to induce WebDriverWait for visibility_of_all_elements_located() and build a locator that does not rely on the id, using either of the following strategies:
Using CSS_SELECTOR:
elements = WebDriverWait(driver, 60).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr.course-row.normal.faculty-BU.active[data-faculty_desc='Goodman School of Business'] a[data-cc][data-cid]")))
for element in elements:
    element.click()
Using XPATH:
elements = WebDriverWait(driver, 60).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[@class='course-row normal faculty-BU active' and @data-faculty_desc='Goodman School of Business']//a[@data-cc and @data-cid]")))
for element in elements:
    element.click()
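As a side note, the sample rows in the question already carry several fields in data-* attributes on each <tr>, so some of the data may be readable without expanding a row at all. The sketch below assumes that is true for the whole page; row_to_record and FIELDS are hypothetical names, and the field list is simply a selection of attributes visible in the posted HTML.

```python
from typing import Dict, Iterable

# Assumed field list for illustration; the names come from the data-*
# attributes visible in the sample rows.
FIELDS = ("cid", "cc", "class_type", "class_time", "location", "instructor")


def row_to_record(row, fields: Iterable[str] = FIELDS) -> Dict[str, str]:
    """Read the chosen data-* attributes from one row element.

    `row` is expected to be a Selenium WebElement (anything that exposes
    get_attribute(name) works). Missing attributes become empty strings.
    """
    record = {}
    for name in fields:
        value = row.get_attribute("data-" + name)
        record[name] = (value or "").strip()
    return record
```

With a live driver this might be used as records = [row_to_record(r) for r in driver.find_elements(By.CSS_SELECTOR, "tr.course-row")]. The expanded description text (format, restrictions, notes) is not in these attributes, so clicking is still needed for that part.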