How can I get the `href` attributes of nested `<a>` elements from a specific `<ul>` into a list?

Question

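I have a <ul> with xpath:position = //ul[5] that contains several <a> elements.

The first <a> has xpath:position = //ul[5]/li/div/div/a, the next <a> has xpath:position = //ul[5]/li[2]/div/div/a, the one after that has xpath:position = //ul[5]/li[3]/div/div/a, and so on.

So for each new <a> in this <ul>, the xpath:position of the <a> gets an index [#] after the <li>.

What I need is an example of how to count how many <a> elements exist in this particular <ul>, and then collect the href attribute of each <a> into a list.

I tried this: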

    WebDriver driver = DriverFactory.getWebDriver()
    def aCount = driver.findElements(By.xpath("//ul[5]/li/div/div/a")).size()
    println aCount

But it counted all the <a> elements on the page, not just the ones in the <ul> with xpath:position = //ul[5]!

Answer 1

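Using absolute XPaths makes tests less robust against HTML changes, so it is better to avoid them.

You only need to combine the following:

Find child elements with element.findElements(By by)
Locate elements by tag name with By.tagName(String tagName)

Code example: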

package tests;

import java.util.ArrayList;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

import selenium.ChromeDriverSetup;

public class CollectHrefsTest extends ChromeDriverSetup {

    public static void main(String[] args) {
        
        List<String> hrefs = new ArrayList<>();
        WebDriver driver = startChromeDriver(); // wrapped driver init
        driver.get("https://www.stackoverflow.com");
        // Collect every <ul> on the page, then search inside each one.
        List<WebElement> ulTags = driver.findElements(By.tagName("ul"));
        for (WebElement ulTag: ulTags) {
            List<WebElement> liTags = ulTag.findElements(By.tagName("li"));
            for (WebElement liTag: liTags) {
                List<WebElement> aTags = liTag.findElements(By.tagName("a"));
                for (WebElement aTag: aTags) {
                    String href = aTag.getAttribute("href");
                    if (href != null) {
                        hrefs.add(href);
                        System.out.println(href);
                    }
                    else {
                        System.out.println("href is null");
                    }
                }
            }
        }
        System.out.println("hrefs collected: " + hrefs.size());
        driver.quit();
    }

}

Output:

Starting ChromeDriver 97.0.4692.71 (adefa7837d02a07a604c1e6eff0b3a09422ab88d-refs/branch-heads/4692@{#1247}) on port 13301
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[1644849838.445][WARNING]: This version of ChromeDriver has not been tested with Chrome version 98.
Úno 14, 2022 3:43:58 ODP. org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Detected dialect: W3C
https://stackoverflow.com/
https://stackoverflow.com/help
https://chat.stackoverflow.com/?tab=site&host=stackoverflow.com
https://meta.stackoverflow.com/
https://stackoverflow.com/questions
https://stackoverflow.com/jobs
https://stackoverflow.com/jobs/directory/developer-jobs
https://stackoverflow.com/jobs/salary
https://stackoverflow.com/help
href is null
href is null
https://stackoverflow.com/teams
https://stackoverflow.com/talent
https://stackoverflow.com/advertising
https://stackoverflowsolutions.com/explore-teams
https://stackoverflow.co/
https://stackoverflow.co/company/press
https://stackoverflow.co/company/work-here
https://stackoverflow.com/legal
https://stackoverflow.com/legal/privacy-policy
https://stackoverflow.com/legal/terms-of-service
https://stackoverflow.co/company/contact
https://stackoverflow.com/#
https://stackoverflow.com/legal/cookie-policy
https://stackexchange.com/sites#technology
https://stackexchange.com/sites#culturerecreation
https://stackexchange.com/sites#lifearts
https://stackexchange.com/sites#science
https://stackexchange.com/sites#professional
https://stackexchange.com/sites#business
https://api.stackexchange.com/
https://data.stackexchange.com/
https://stackoverflow.blog/?blb=1
https://www.facebook.com/officialstackoverflow/
https://twitter.com/stackoverflow
https://linkedin.com/company/stack-overflow
https://www.instagram.com/thestackoverflow
hrefs collected: 35
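
Note that the code in this answer walks every <ul> on the page. To restrict the search to a single list, the same "search from a parent element" idea can be applied to just one element. Below is a minimal sketch of that idea using the JDK's built-in javax.xml.xpath instead of Selenium, so that it runs without a browser. The sample markup, the class name ScopedHrefs and the method hrefsInList are made up for illustration. With Selenium, the rough equivalent would be to locate the list once and call findElements on that WebElement.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ScopedHrefs {

    // Hypothetical markup: only the second list is of interest.
    public static final String SAMPLE =
        "<html><body>"
        + "<ul><li><a href='/nav'>nav</a></li></ul>"
        + "<ul>"
        + "<li><div><div><a href='/one'>one</a></div></div></li>"
        + "<li><div><div><a href='/two'>two</a></div></div></li>"
        + "<li><div><div><a>no href</a></div></div></li>"
        + "</ul>"
        + "</body></html>";

    // Returns the href values of all <a> elements inside the n-th <ul> (1-based).
    public static List<String> hrefsInList(String xml, int listIndex) {
        List<String> hrefs = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // (//ul)[n] selects the n-th <ul> in document order.
            Node ul = (Node) xpath.evaluate("(//ul)[" + listIndex + "]", doc, XPathConstants.NODE);
            if (ul == null) {
                return hrefs; // no such list
            }

            // ".//a" is relative to the <ul>, so links outside it are ignored.
            NodeList anchors = (NodeList) xpath.evaluate(".//a", ul, XPathConstants.NODESET);
            for (int i = 0; i < anchors.getLength(); i++) {
                Element a = (Element) anchors.item(i);
                if (a.hasAttribute("href")) {
                    hrefs.add(a.getAttribute("href"));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
        return hrefs;
    }

    public static void main(String[] args) {
        List<String> hrefs = hrefsInList(SAMPLE, 2);
        System.out.println(hrefs);
        System.out.println("hrefs collected: " + hrefs.size());
    }
}
```

The sketch uses (//ul)[n] rather than //ul[n] on purpose: in XPath, //ul[n] matches every <ul> that is the n-th <ul> child of its own parent, which can be more than one element on a page with nested or sibling containers.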

Answer 2

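All the <a> elements sit inside their ancestor <li> elements, and all the <li> elements sit inside //ul[5]. So the solution is to go through all the <li> elements, which you can do with the following: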

WebDriver driver = DriverFactory.getWebDriver()
def aCount = driver.findElements(By.xpath("//ul[5]//li/div/div/a")).size()
                      //note the double slash here ^
println aCount
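
To see what the double slash changes, here is a small sketch that compares the two forms on a made-up snippet. It uses the JDK's javax.xml.xpath so it can run without a browser. The markup, the id "menu", the class name ChildVsDescendant and the helper count are assumptions for illustration only and are not taken from the page in the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class ChildVsDescendant {

    // Hypothetical markup: the second <li> contains a nested list.
    public static final String SAMPLE =
        "<ul id='menu'>"
        + "<li><div><div><a href='/a'>a</a></div></div></li>"
        + "<li><div><div><a href='/b'>b</a></div></div>"
        +   "<ul><li><div><div><a href='/b1'>b1</a></div></div></li></ul>"
        + "</li>"
        + "</ul>";

    // Counts the nodes matched by an XPath expression in the given XML string.
    public static int count(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // "/li" follows only the direct <li> children of the list.
        System.out.println(count(SAMPLE, "//ul[@id='menu']/li/div/div/a"));
        // "//li" follows <li> elements at any depth below the list.
        System.out.println(count(SAMPLE, "//ul[@id='menu']//li/div/div/a"));
    }
}
```

Since //li matches every <li> that /li matches plus any deeper ones, the double slash can only keep the count the same or increase it. It helps when some <li> elements are not direct children of the list, but it will not narrow a result that is already too large.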

Answer 3

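The problem was that there are two kinds of <a> in //ul[5]: //ul[5]/li/div/div/a and //ul[5]/li/div/div[2]/a.

In the first case, the <div> that wraps the <a> has the class name heading-4 (div[@class="heading-4"]/a[1]). In the second case, the <div> that wraps the <a> has the class name heading-4-sub (div[@class="heading-4-sub"]/a[1]).

When I counted the <a> elements, I got both kinds.

So I had to do this: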

WebDriver driver = DriverFactory.getWebDriver()

List<String> hrefs = []
List<WebElement> aTags = driver.findElements(By.xpath('//ul[5]/li/div/div[@class="heading-4"]/a'))

for (WebElement aTag in aTags) {
    String href = aTag.getAttribute("href")
    if (href != null) {
        hrefs.add(href);
    } else {
        hrefs.add('Empty Link');
    }
}

System.out.println(hrefs + "\n\nURLs Found: " + hrefs.size())

I had been using findElements(By.xpath("//ul[5]/li/div/div/a")) when I should have used findElements(By.xpath('//ul[5]/li/div/div[@class="heading-4"]/a')), which returns only the <a> elements wrapped by a <div> with the class name "heading-4".
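
For anyone who wants to try the difference without a browser, here is a minimal sketch that runs both expressions against a made-up snippet with the same shape (two <div> wrappers per <li>). It uses the JDK's javax.xml.xpath rather than Selenium or Katalon. The markup, the href values, the class name ClassPredicate and the helper hrefs are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class ClassPredicate {

    // Hypothetical markup: each <li> holds a main link and a secondary link.
    public static final String SAMPLE =
        "<ul>"
        + "<li><div>"
        +   "<div class='heading-4'><a href='/post-1'>Post 1</a></div>"
        +   "<div class='heading-4-sub'><a href='/author-1'>Author 1</a></div>"
        + "</div></li>"
        + "<li><div>"
        +   "<div class='heading-4'><a href='/post-2'>Post 2</a></div>"
        +   "<div class='heading-4-sub'><a href='/author-2'>Author 2</a></div>"
        + "</div></li>"
        + "</ul>";

    // Returns the href values of the <a> elements matched by anchorXPath.
    public static List<String> hrefs(String xml, String anchorXPath) {
        List<String> result = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Appending /@href selects the attribute nodes of the matched anchors.
            NodeList attrs = (NodeList) xpath.evaluate(anchorXPath + "/@href", doc, XPathConstants.NODESET);
            for (int i = 0; i < attrs.getLength(); i++) {
                result.add(attrs.item(i).getNodeValue());
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Without a predicate, both kinds of <a> are matched.
        System.out.println(hrefs(SAMPLE, "//ul/li/div/div/a"));
        // With the class predicate, only the links under "heading-4" remain.
        System.out.println(hrefs(SAMPLE, "//ul/li/div/div[@class=\"heading-4\"]/a"));
    }
}
```

Keep in mind that @class="heading-4" is an exact string comparison, so it would not match a <div> that carries additional classes. A looser test such as contains(@class, "heading-4") would not help here either, because it also matches "heading-4-sub".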

https://docs.katalon.com/katalon-studio/docs/detect_elements_xpath.html#what-is-xpath


java

selenium

swing

katalon-studio

katalon-recorder