XPath for specific elements
I can't scrape the text of specific elements from this web page:
https://www.oddsportal.com/soccer/africa/africa-cup-of-nations/benin-togo-IsfnZDFd/
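This is the URL of one particular match from the archived results, and I need to get the odds of 4 bookmakers from that page. I have thousands of match URLs that I want to scrape.
This is how I tried to locate the bookmaker odds, but it doesn't work: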
library(RSelenium)

# Start a PhantomJS server and describe the browser session
pjs <- wdman::phantomjs()
eCap <- list(
  phantomjs.page.settings.userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20120101 Firefox/29.0",
  phantomjs.page.settings.loadImages = FALSE,
  phantomjs.phantom.cookiesEnabled = TRUE,
  phantomjs.phantom.javascriptEnabled = TRUE
)
remDr <- remoteDriver(browserName = "phantomjs", port = 4567L, extraCapabilities = eCap)
remDr$open()
remDr$navigate("https://www.oddsportal.com/soccer/africa/africa-cup-of-nations/benin-togo-IsfnZDFd/")
match <- remDr$findElement('xpath', '//*[@id="col-content"]/h1')
result <- remDr$findElement('xpath', '//*[@id="event-status"]/p/strong')
# This is the lookup that does not return the odds
odds <- remDr$findElements('xpath', '//*[@class="name" and contains(text(), "18Bet")]')
odds1 <- data.frame(odds = unlist(sapply(odds, function(x) x$getElementText())))
pjs$stop()
What I want are the 3 odds in the last divs. There are many different bookmakers on the page, and so far I can only select the odds of all bookmakers at once. My goal is to select the odds of one exact bookmaker, but I'm not sure how to do that, because the odds divs themselves carry no information about the bookmaker.
<tr class="lo odd">
  <td>
    <div class="l">
      <a class="name2" title="Go to 18bet website!" onclick="return !window.open(this.href)" href="/bookmaker/18bet/link/"><span class="blogos l416"></span></a>
      <a class="name" title="Go to 18bet website!" onclick="return !window.open(this.href)" href="/bookmaker/18bet/link/">18bet</a>
    </div>
    <span class="ico-bookmarker-info ico-bookmaker-detail">
      <a title="Show more details about 18bet" href="/bookmaker/18bet/"></a>
    </span>
    <span class="ico-bookmarker-info ico-bookmaker-bonus">
      <a onmouseout="globals.getBookmaker(416).cancelBonusOver();" xparam="&lt;div class=&quot;bold&quot;&gt;100% Bonus up to 100€!&lt;/div&gt;&lt;div&gt;100% first deposit bonus up to 100€! Promocode: WSB100&lt;/div&gt;~3" onmouseover="globals.getBookmaker(416).trackBonusOver()" onclick="globals.getBookmaker(416).trackBonusClick();return !window.open(this.href);" href="/bookmaker/18bet/bonus/252"></a>
    </span>
  </td>
  <td class="right odds">
    <div onmouseout="delayHideTip()" onmouseover="page.hist(this,'P-0.00-0-0','2mlnbxv464x0x65lst',416,event,0,1)">2.05</div>
  </td>
  <td class="right odds up">
    <div onmouseout="delayHideTip()" onmouseover="page.hist(this,'P-0.00-0-0','2mlnbxv498x0x0',416,event,0,1)">3.20</div>
  </td>
  <td class="right odds">
    <div onmouseout="delayHideTip()" onmouseover="page.hist(this,'P-0.00-0-0','2mlnbxv464x0x65lsu',416,event,0,1)">3.50</div>
  </td>
  <td class="center info-value"><span>92.1%</span></td>
  <td onmouseout="delayHideTip()" class="check ch3" xparam="The match has already started~2"></td>
</tr>
Thanks in advance for your replies.
Here are example XPaths that select the tr of one bookmaker, 18bet.
1. Find the a with class=name and text="18bet", then get its ancestor tr with class=lo:
//a[@class="name" and .="18bet"]/ancestor::tr[contains(@class, "lo")]
2. Find the tr with class=lo that has a descendant a with class=name and text="18bet":
//tr[contains(@class, "lo") and .//a[@class="name" and .="18bet"]]
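One detail worth checking: XPath string comparison is case-sensitive, and the attempt in the question searches for "18Bet" while the markup contains "18bet". Below is a minimal sketch of a helper that builds the first XPath for any bookmaker and compares the name case-insensitively with the XPath 1.0 functions translate() and normalize-space(). The name row_xpath is hypothetical, and the lower-casing only covers ASCII letters.

```python
import string


def row_xpath(bookmaker: str) -> str:
    """Build an XPath selecting the odds row (tr) of one bookmaker.

    Hypothetical helper, not part of any library. The link text is
    lower-cased inside XPath with translate(), so "18Bet" and "18bet"
    both match. Only ASCII letters are folded.
    """
    if '"' in bookmaker:
        raise ValueError("bookmaker name must not contain a double quote")
    lowered = (
        f'translate(normalize-space(.), "{string.ascii_uppercase}", '
        f'"{string.ascii_lowercase}")'
    )
    return (
        f'//a[@class="name" and {lowered}="{bookmaker.lower()}"]'
        '/ancestor::tr[contains(@class, "lo")]'
    )


print(row_xpath("18Bet"))
```

The returned string can be passed to any XPath 1.0 engine in place of the literal expression above.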
1 odds: //a[@class="name" and .="18bet"]/ancestor::tr[contains(@class, "lo")]//td[2]
X odds: //a[@class="name" and .="18bet"]/ancestor::tr[contains(@class, "lo")]//td[3]
2 odds: //a[@class="name" and .="18bet"]/ancestor::tr[contains(@class, "lo")]//td[4]
Payout: //a[@class="name" and .="18bet"]/ancestor::tr[contains(@class, "lo")]//td[5]
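To see how td[2] through td[5] map onto the row, here is a small offline sketch using only Python's standard library. It is an assumption-laden illustration: ROW_HTML is a simplified, well-formed copy of the row from the question, and because ElementTree's XPath subset has no ancestor:: axis or contains() function, the row is filtered in Python instead of with the expressions above. The name odds_for is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Simplified, well-formed copy of the row from the question (assumption:
# the real page is HTML and would need an HTML-aware parser or a browser).
ROW_HTML = """
<table>
  <tr class="lo odd">
    <td><div class="l"><a class="name" href="/bookmaker/18bet/link/">18bet</a></div></td>
    <td class="right odds"><div>2.05</div></td>
    <td class="right odds up"><div>3.20</div></td>
    <td class="right odds"><div>3.50</div></td>
    <td class="center info-value"><span>92.1%</span></td>
    <td class="check ch3"></td>
  </tr>
</table>
"""


def odds_for(table: ET.Element, bookmaker: str) -> Optional[dict]:
    """Return the 1 / X / 2 / payout cell texts of one bookmaker's row."""
    for tr in table.iter("tr"):
        # Substring test, the same check as contains(@class, "lo").
        if "lo" not in tr.get("class", ""):
            continue
        # ElementTree supports simple [@attr='value'] predicates.
        names = [a.text for a in tr.findall(".//a[@class='name']")]
        if bookmaker not in names:
            continue
        cells = tr.findall("td")  # direct children only
        labels = ("1", "X", "2", "payout")
        # XPath td[2]..td[5] correspond to cells[1]..cells[4] in Python.
        return {k: "".join(c.itertext()).strip() for k, c in zip(labels, cells[1:5])}
    return None


print(odds_for(ET.fromstring(ROW_HTML), "18bet"))
# {'1': '2.05', 'X': '3.20', '2': '3.50', 'payout': '92.1%'}
```

Note the off-by-one between the two notations: XPath positions start at 1, Python list indexes at 0.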
Python code example:
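from selenium.webdriver.common.by import By

# `driver` is an already started WebDriver with the match page loaded.
# Selenium 4.3+ removed find_element_by_xpath; find_element(By.XPATH, ...) replaces it.
row = driver.find_element(By.XPATH, '//a[@class="name" and .="18bet"]/ancestor::tr[contains(@class, "lo")]')
odd_1 = row.find_element(By.XPATH, './/td[2]')
odd_x = row.find_element(By.XPATH, './/td[3]')
odd_2 = row.find_element(By.XPATH, './/td[4]')
odd_payout = row.find_element(By.XPATH, './/td[5]')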
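Since the question asks for several bookmakers per page, the same lookup can be wrapped in a loop. The following is a sketch under assumptions: read_odds is a hypothetical name, driver is a Selenium 4 WebDriver with the match page already loaded, and the locator strategy is passed as the plain string "xpath" (the value of By.XPATH) so the block has no imports of its own.

```python
def read_odds(driver, bookmakers):
    """Collect the 1 / X / 2 / payout text for each requested bookmaker.

    `driver` is assumed to be a Selenium 4 WebDriver with the match page
    already loaded. Bookmakers that are not on the page map to None.
    """
    results = {}
    for name in bookmakers:
        xpath = (
            f'//a[@class="name" and .="{name}"]'
            '/ancestor::tr[contains(@class, "lo")]'
        )
        # find_elements returns an empty list when nothing matches,
        # whereas find_element would raise NoSuchElementException.
        rows = driver.find_elements("xpath", xpath)
        if not rows:
            results[name] = None
            continue
        # td[2]..td[5] hold the 1, X, 2 odds and the payout, in that order.
        cells = [rows[0].find_element("xpath", f"./td[{i}]").text for i in range(2, 6)]
        results[name] = dict(zip(("1", "X", "2", "payout"), cells))
    return results
```

Called as read_odds(driver, names) with the list of bookmaker names of interest, it returns one dictionary per bookmaker, which is easy to append to a table while iterating over the match URLs.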