Python Selenium: can't retrieve the text of an XPath element
I'm struggling to scrape some pages. The problem shows up when the page structure involves many nested divs.
Here is the page source:
<div>
<section class="ui-accordion-header ui-state-default ui-corner-all ui-accordion-icons" role="tab" id="ui-id-1" aria-controls="ui-id-2" aria-selected="false" aria-expanded="false" tabindex="0"><span class="ui-accordion-header-icon ui-icon ui-icon-triangle-1-e"></span>
<div class="detail-avocat">
<div class="nom-avocat">Me <span class="avocat_name">NAME </span></div>
<div class="type-avocat">Avocat postulant au Tribunal Judiciaire</div>
</div>
<div class="more-info">Plus d'informations</div>
</section>
<div class="ui-accordion-content ui-helper-reset ui-widget-content ui-corner-bottom" style="display: none;" id="ui-id-2" aria-labelledby="ui-id-1" role="tabpanel" aria-hidden="true">
<div class="details">
<div class="detail-avocat-row ">
<div class="detail-avocat-content overflow-h">
<span>Structure :</span>
<div>
<p>Cabinet individuel NAME</p>
</div>
</div>
</div>
<div class="detail-avocat-row ">
<div class="detail-avocat-content overflow-h">
<span>Adresse :</span>
<div>
<p>21 rue Belle Isle 57000 VILLE</p>
</div>
</div>
</div>
<div class="detail-avocat-row ">
<div class="detail-avocat-content overflow-h">
<span>Mail :</span>
<div>
<p>cabinet@mail.fr</p>
</div>
</div>
</div>
<div class="detail-avocat-row">
<div class="detail-avocat-content overflow-h">
<span>Tél :</span>
<div>
<p>Telnum</p>
</div>
</div>
</div>
<div class="detail-avocat-row">
<div class="detail-avocat-content overflow-h">
<span>Fax :</span>
<div>
<p> </p>
</div>
</div>
</div>
<div class="contact-avocat"> <a href="mailto:cabinet@mail.fr">Contacter</a> </div>
</div>
</div>
</div>
Here is my Python code:
from selenium.webdriver.common.by import By

# Collect every <p> nested under a "detail-avocat-content" block.
divtel = self.driver.find_elements(
    by=By.XPATH,
    value='//div[@class="detail-avocat-content overflow-h"]/div/p',
)
for p in divtel:
    print(p.text)
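It doesn't print anything. On other, similar pages the text is printed, but here, despite the nested spans and div/p elements, I get no text at all. Do you know why?
How can I solve this?
Thanks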
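The .text property only returns text when the web element containing it is visible on the page. In your HTML the accordion panel (id="ui-id-2") has style="display: none;", so every <p> inside it is hidden and .text comes back empty. For a hidden web element you have to use .get_attribute('innerText'), .get_attribute('textContent') or .get_attribute('innerHTML') instead (the three differ in how they treat markup and whitespace). So, for example, change
print(p.text)
to
print(p.get_attribute('innerText'))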
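If the same scraper has to handle both visible and hidden panels, the fallback can be wrapped in a small helper. This is only a sketch: `read_text` is a hypothetical name, not part of Selenium, and it assumes the object passed in behaves like a WebElement (a `.text` property and a `.get_attribute(name)` method).

```python
def read_text(element):
    """Return an element's text even when the element is hidden.

    `element` is assumed to behave like a Selenium WebElement: it needs a
    `.text` property and a `.get_attribute(name)` method.
    """
    visible = element.text  # empty for elements that are not displayed
    if visible and visible.strip():
        return visible.strip()
    # textContent is read from the DOM, so CSS visibility does not matter.
    raw = element.get_attribute("textContent")
    return raw.strip() if raw else ""


# Hypothetical usage with the locator from the question:
# xpath = '//div[@class="detail-avocat-content overflow-h"]/div/p'
# for p in driver.find_elements(By.XPATH, xpath):
#     print(read_text(p))
```

Trying `.text` first keeps the output identical on pages where the panel is already expanded; the `textContent` fallback only applies when nothing visible comes back, as with the empty Fax field or a collapsed accordion.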