How to extract the value from the <td> element within a table using Selenium and Python

Question

I am trying to extract the value 1 from a table using Selenium, but I have not found a good way to do it.

<td width="1%" style="text-align: right">1</td>

The page HTML looks like this:

<tr class="linhaPar" onMouseOver="javascript:this.style.backgroundColor='#C4D2EB'" onMouseOut="javascript:this.style.backgroundColor=''">
   <td>
      Scientific American
   </td>
   <td>
      A Base Molecular da Vida  Uma Introducao a Biologia Molecular
   </td>
   <td>
   </td>
   <td>
      <table width="100%">
         <tbody style="background-color: transparent;">
            <tr>
               <td>
                  1971
               </td>
            </tr>
         </tbody>
      </table>
   </td>
   <td width="1%" style="text-align: right">
      1
   </td>
   <td width="1%"> 
      <a id="formBuscaPublica:ClinkView" href="#" onclick="if(typeof jsfcljs == 'function'){jsfcljs(document.getElementById('formBuscaPublica'),{'formBuscaPublica:ClinkView':'formBuscaPublica:ClinkView','idTitulo':'39117','idsBibliotecasAcervoPublicoFormatados':'47_46','apenasSituacaoVisivelUsuarioFinal':'true'},'');}return false"><img id="formBuscaPublica:ImageView" src="/sigaa/img/view.gif" style="border:none" title="Visualizar Informa&ccedil;&otilde;es dos Materiais Informacionais" /></a>
   </td>

I tried the following code, but it does not work at all.

x = browser.find_elements_by_xpath('//*[@id="listagem"]/tbody/tr[1]/td[5]/').text

Thanks!

Answer 1

Here is how I do it: I wrote a reusable function that returns the first element matching a given tag and set of attributes.

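def getElementByTagAndAttributes(browser, tag, **kwargs):
    # Scan every <tag> element and return the first one whose attributes all match.
    for element in browser.find_elements_by_tag_name(tag):
        for key, value in kwargs.items():
            attribute = element.get_attribute(key)
            if attribute != value:
                break  # mismatch: move on to the next element
        else:
            # The inner loop finished without a break, so every attribute matched.
            return element
    # Falls through and returns None when no element matches.

x = getElementByTagAndAttributes(browser, "td", width="1%", style="text-align: right").text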
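
Recent Selenium 4 releases no longer ship the find_elements_by_* helpers, so the function above may need porting. The following is a minimal sketch of the same idea, not a definitive implementation. The name get_element_by_tag_and_attributes is hypothetical, and the string "tag name" is used in place of importing By.TAG_NAME (it is the value behind that constant). Note also that the browser may report the style attribute in a normalized form, so an exact string comparison on style can miss.

```python
def get_element_by_tag_and_attributes(driver, tag, **attributes):
    """Return the first <tag> element whose attributes all match, else None."""
    # "tag name" is the string value behind selenium's By.TAG_NAME constant.
    for element in driver.find_elements("tag name", tag):
        if all(element.get_attribute(name) == value
               for name, value in attributes.items()):
            return element
    return None

# Hypothetical usage with a live Selenium 4 session named `browser`:
#   cell = get_element_by_tag_and_attributes(
#       browser, "td", width="1%", style="text-align: right")
#   if cell is not None:
#       print(cell.text.strip())
```

Returning None explicitly makes the "nothing matched" case visible to the caller, which can then check before reading .text.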

Answer 2

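Because the data is laid out as a table with rows and columns, you can locate a value relative to other known data in the same row. In your case, assuming you want to retrieve the value 1 for the row containing "Scientific American", use the XPath below: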

x = browser.find_element_by_xpath("//tr/td[contains(.,'Scientific American')]/following-sibling::td[4]").text
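
The same row-relative lookup can be wrapped in a small helper so that the label and the column offset become parameters. This is only a sketch under stated assumptions: both function names are hypothetical, the label is assumed to contain no single quote (it is embedded in a quoted XPath literal), and the string "xpath" stands in for By.XPATH, whose value it is.

```python
def sibling_cell_xpath(label, offset):
    """Build an XPath for the <td> that sits `offset` cells after the one containing `label`."""
    # Assumes `label` has no single quote, because it is placed inside '...'.
    return f"//tr/td[contains(., '{label}')]/following-sibling::td[{offset}]"


def read_sibling_cell(driver, label, offset):
    """Return the stripped text of the cell located by sibling_cell_xpath()."""
    # "xpath" is the string value behind selenium's By.XPATH constant.
    element = driver.find_element("xpath", sibling_cell_xpath(label, offset))
    return element.text.strip()

# Hypothetical usage with a live Selenium 4 session named `browser`:
#   print(read_sibling_cell(browser, "Scientific American", 4))
```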

Answer 3

Try the following XPath:

x = driver.find_element_by_xpath('//tr[@class="linhaPar" and contains(.,"Scientific American")]//td[contains(@style, "text-align")]').text
print(x)

Note:

Use .find_element rather than .find_elements. find_elements returns a list, and a list has no .text attribute.
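
If you do prefer find_elements (it returns an empty list when nothing matches, whereas find_element raises NoSuchElementException), the text has to be read from an item of the list rather than from the list itself. A minimal sketch, where first_text is a hypothetical helper name:

```python
def first_text(elements):
    """Return the stripped text of the first element, or None for an empty list."""
    return elements[0].text.strip() if elements else None

# Hypothetical usage with a live Selenium 4 session named `driver`:
#   cells = driver.find_elements("xpath", '//td[contains(@style, "text-align")]')
#   print(first_text(cells))
```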

Answer 4

To extract the text 1 from the element:

<td width="1%" style="text-align: right">1</td>

You can use either of the following XPath-based solutions:

Using the text Scientific American:

print(browser.find_element_by_xpath("//td[contains(., 'Scientific American')]//following::td[3]//following-sibling::td[1]").text)

Using the text A Base Molecular da Vida  Uma Introducao a Biologia Molecular:

print(browser.find_element_by_xpath("//td[contains(., 'A Base Molecular da Vida  Uma Introducao a Biologia Molecular')]//following::td[2]//following-sibling::td[1]").text)
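
The locators in this thread count cells in two different ways: following-sibling:: only sees the direct cells of the row, while following:: also sees the nested cell holding 1971. The sketch below illustrates that difference offline with the standard library's xml.etree.ElementTree instead of Selenium. The ROW string is a simplified, hypothetical copy of the row from the question (the link cell is emptied and the row is closed so that it parses as XML).

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the row in the question (an assumption, not the real page).
ROW = """
<tr class="linhaPar">
  <td>Scientific American</td>
  <td>A Base Molecular da Vida  Uma Introducao a Biologia Molecular</td>
  <td></td>
  <td><table><tbody><tr><td>1971</td></tr></tbody></table></td>
  <td width="1%" style="text-align: right"> 1 </td>
  <td width="1%"></td>
</tr>
"""

row = ET.fromstring(ROW)
direct_cells = row.findall("td")    # direct children of the row only
all_cells = row.findall(".//td")    # every <td> below the row, nested ones included

value = direct_cells[4].text.strip()        # fifth direct cell of the row
nested_value = all_cells[4].text.strip()    # fifth <td> in document order
print(len(direct_cells), len(all_cells), value, nested_value)
```

Counting only sibling cells lands on the target cell, while counting every td in document order lands on the nested year cell at the same index, which is why the offsets differ between the two axes.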
