How can I get the `href` attributes of nested `<a>` elements from a specific `<ul>` into a list?
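I have a <ul> with xpath:position = //ul[5] that contains several <a> elements.
The first <a> has xpath:position = //ul[5]/li/div/div/a, the next <a> has xpath:position = //ul[5]/li[2]/div/div/a, the next has xpath:position = //ul[5]/li[3]/div/div/a, and so on.
So for each new <a> in this <ul>, the xpath:position of the <a> gets an index [#] after the <li>.
What I need is an example of how to count how many <a> elements exist in this specific <ul>, and then collect the href attribute of each <a> into a list.
I tried this: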
WebDriver driver = DriverFactory.getWebDriver()
def aCount = driver.findElements(By.xpath("//ul[5]/li/div/div/a")).size()
println aCount
But it counts all the <a> elements on the page, not only the ones in the <ul> with xpath:position = //ul[5]!
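Absolute XPaths make tests less resilient to HTML changes, so it is best to avoid them.
You only need a combination of:
- element.findElements(By by) to work with parent/child elements
- By.tagName(String tagName) to find the child elements
Code example: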
package tests;

import java.util.ArrayList;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import selenium.ChromeDriverSetup;

public class CollectHrefsTest extends ChromeDriverSetup {

    public static void main(String[] args) {
        List<String> hrefs = new ArrayList<String>();
        WebDriver driver = startChromeDriver(); // wrapped driver init
        driver.get("https://www.whosebug.com");
        List<WebElement> ulTags = driver.findElements(By.tagName("ul"));
        for (WebElement ulTag: ulTags) {
            List<WebElement> liTags = ulTag.findElements(By.tagName("li"));
            for (WebElement liTag: liTags) {
                List<WebElement> aTags = liTag.findElements(By.tagName("a"));
                for (WebElement aTag: aTags) {
                    String href = aTag.getAttribute("href");
                    if (href != null) {
                        hrefs.add(href);
                        System.out.println(href);
                    }
                    else {
                        System.out.println("href is null");
                    }
                }
            }
        }
        System.out.println("hrefs collected: " + hrefs.size());
        driver.quit();
    }
}
Output:
Starting ChromeDriver 97.0.4692.71 (adefa7837d02a07a604c1e6eff0b3a09422ab88d-refs/branch-heads/4692@{#1247}) on port 13301
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[1644849838.445][WARNING]: This version of ChromeDriver has not been tested with Chrome version 98.
Úno 14, 2022 3:43:58 ODP. org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Detected dialect: W3C
https://whosebug.com/
https://whosebug.com/help
https://chat.whosebug.com/?tab=site&host=whosebug.com
https://meta.whosebug.com/
https://whosebug.com/questions
https://whosebug.com/jobs
https://whosebug.com/jobs/directory/developer-jobs
https://whosebug.com/jobs/salary
https://whosebug.com/help
href is null
href is null
https://whosebug.com/teams
https://whosebug.com/talent
https://whosebug.com/advertising
https://Whosebugsolutions.com/explore-teams
https://Whosebug.co/
https://Whosebug.co/company/press
https://Whosebug.co/company/work-here
https://whosebug.com/legal
https://whosebug.com/legal/privacy-policy
https://whosebug.com/legal/terms-of-service
https://Whosebug.co/company/contact
https://whosebug.com/#
https://whosebug.com/legal/cookie-policy
https://stackexchange.com/sites#technology
https://stackexchange.com/sites#culturerecreation
https://stackexchange.com/sites#lifearts
https://stackexchange.com/sites#science
https://stackexchange.com/sites#professional
https://stackexchange.com/sites#business
https://api.stackexchange.com/
https://data.stackexchange.com/
https://Whosebug.blog/?blb=1
https://www.facebook.com/officialWhosebug/
https://twitter.com/Whosebug
https://linkedin.com/company/stack-overflow
https://www.instagram.com/theWhosebug
hrefs collected: 35
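All the <a> elements are inside their ancestor <li> elements, and all the <li> elements are inside //ul[5]. So the solution is to go through all the <li> elements, which you can do with the following: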
WebDriver driver = DriverFactory.getWebDriver()
def aCount = driver.findElements(By.xpath("//ul[5]//li/div/div/a")).size()
//note the double slash here ^
println aCount
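One thing that may be worth checking when a count comes out larger than expected: in XPath, //ul[5] selects every <ul> that is the fifth <ul> child of its own parent, while (//ul)[5] selects the fifth <ul> in the whole document. The sketch below illustrates the difference with the JDK's built-in javax.xml.xpath instead of Selenium, so it runs without a browser. The class name UlIndexSketch, the SAMPLE markup, and the use of index 2 rather than 5 are all assumptions made for illustration; whether this distinction applies to the asker's page is not known.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UlIndexSketch {

    // Hypothetical markup: two <div> containers, each holding two <ul> lists.
    static final String SAMPLE =
        "<body>"
        + "<div><ul><li><a href=\"/a\"/></li></ul><ul><li><a href=\"/b\"/></li></ul></div>"
        + "<div><ul><li><a href=\"/c\"/></li></ul><ul><li><a href=\"/d\"/></li></ul></div>"
        + "</body>";

    // Returns how many nodes the XPath expression selects in the given markup.
    static int count(String markup, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(markup)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Second <ul> child of EACH parent: matches one list per <div>.
        System.out.println(count(SAMPLE, "//ul[2]//a"));   // prints 2
        // Second <ul> in the whole document: matches exactly one list.
        System.out.println(count(SAMPLE, "(//ul)[2]//a")); // prints 1
    }
}
```

If the page has several containers with five or more lists each, wrapping the list selector in parentheses before indexing is one way to pin the query to a single list.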
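The problem was that there are two kinds of <a> in //ul[5]: //ul[5]/li/div/div/a and //ul[5]/li/div/div[2]/a.
In the first case, the <div> wrapping the <a> has the class name heading-4 (div[@class="heading-4"]/a[1]).
In the second case, the <div> wrapping the <a> has the class name heading-4-sub (div[@class="heading-4-sub"]/a[1]).
When I counted the <a> elements, I got both kinds.
So I had to do this: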
WebDriver driver = DriverFactory.getWebDriver()
List<String> hrefs = []
List<WebElement> aTags = driver.findElements(By.xpath('//ul[5]/li/div/div[@class="heading-4"]/a'))
for (WebElement aTag in aTags) {
    String href = aTag.getAttribute("href")
    if (href != null) {
        hrefs.add(href);
    } else {
        hrefs.add('Empty Link');
    }
}
System.out.println(hrefs + "\n\nURLs Found: " + hrefs.size())
I was using:
findElements(By.xpath("//ul[5]/li/div/div/a"))
instead of:
findElements(By.xpath('//ul[5]/li/div/div[@class="heading-4"]/a'))
which gets only the <a> elements wrapped by a <div> with the class name "heading-4".
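To see the effect of the class predicate without a browser, here is a minimal sketch that evaluates both expressions against a small hand-written document using the JDK's javax.xml.xpath (not Selenium). The class name HrefXPathSketch, the SAMPLE markup, the href values, and the use of //ul[1] in place of //ul[5] are assumptions made for illustration and are not taken from the real page.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HrefXPathSketch {

    // Hypothetical stand-in for the page: each <li> holds one "heading-4"
    // link and one "heading-4-sub" link, as described in the answer above.
    static final String SAMPLE =
        "<html><body><ul>"
        + "<li><div>"
        + "<div class=\"heading-4\"><a href=\"/main-1\">Main 1</a></div>"
        + "<div class=\"heading-4-sub\"><a href=\"/sub-1\">Sub 1</a></div>"
        + "</div></li>"
        + "<li><div>"
        + "<div class=\"heading-4\"><a href=\"/main-2\">Main 2</a></div>"
        + "<div class=\"heading-4-sub\"><a href=\"/sub-2\">Sub 2</a></div>"
        + "</div></li>"
        + "</ul></body></html>";

    // Collects the href attribute of every element selected by the expression.
    static List<String> collectHrefs(String markup, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(markup)), XPathConstants.NODESET);
            List<String> hrefs = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                hrefs.add(((Element) nodes.item(i)).getAttribute("href"));
            }
            return hrefs;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Without the predicate, both kinds of link are matched.
        List<String> all = collectHrefs(SAMPLE, "//ul[1]/li/div/div/a");
        // With the predicate, only links inside div.heading-4 are matched.
        List<String> main = collectHrefs(SAMPLE, "//ul[1]/li/div/div[@class='heading-4']/a");
        System.out.println(all.size() + " " + all);   // prints 4 [/main-1, /sub-1, /main-2, /sub-2]
        System.out.println(main.size() + " " + main); // prints 2 [/main-1, /main-2]
    }
}
```

Note that @class='heading-4' is an exact string comparison, so it would not match a <div> that carries additional classes alongside heading-4.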
https://docs.katalon.com/katalon-studio/docs/detect_elements_xpath.html#what-is-xpath