Getting data from this HTML table with Python
I want to extract the currency exchange rate data shown in this table.
The page is https://www.iceplc.com/travel-money/exchange-rates
I have already tried this approach, but it does not work:
table_id = driver.find_element(By.ID, 'data_configuration_feeds_ct_fields_body0')
rows = table_id.find_elements(By.TAG_NAME, "tr")  # get all of the rows in the table
for row in rows:
    col = row.find_elements(By.TAG_NAME, "td")[1]  # note: index starts from 0, so 1 is column 2
    print(col.text)  # prints text from the element
Here is the HTML:
</td>
<td valign="top" class="OuterProdCell test">
<table class="ProductCell">
<tr>
<td class="rateCountryFlag">
<ul id="prodImages">
<li>
<a href="/travel-money/buy-chilean-peso" title="Buy Chilean Peso" class="flags chilean-peso" ></a>
</li>
</ul>
</td>
<td class="ratesName">
<a href="/travel-money/buy-chilean-peso" title="Buy Chilean Peso">
Chilean Peso</a>
</td>
<td class="ratesClass">
<a class="orderText" href="/travel-money/buy-chilean-peso" title="Buy Chilean Peso">774.8540</a>
</td>
<td class="orderNow">
<ul id="prodImages">
<li>
<a class="reserveNow" href="/travel-money/buy-chilean-peso" title="Buy Chilean Peso">Order<br/>now</a>
</li>
<li>
<a href="/travel-money/buy-chilean-peso" title="Buy Chilean Peso" class="flags arrowGreen" ></a>
</li>
</ul>
</td>
</tr>
</table>
I also tried the following Python Selenium approach. It gets the exchange rate for each currency, but not the name:
driver.get("https://www.iceplc.com/travel-money/exchange-rates")
rates = driver.find_elements_by_class_name("ratesClass")
for rate in rates:
    print(rate.text)
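If you only want to get exchange rates, you are better off using an API; see this question. Web scraping leaves you exposed to changes in the target page, any of which can break your code.
If scraping is your goal, you just need to reuse your Selenium approach, but search for the "ratesName" class as well.
For example: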
driver.get("https://www.iceplc.com/travel-money/exchange-rates")
names = driver.find_elements_by_class_name("ratesName")
values = driver.find_elements_by_class_name("ratesClass")
# pair each name element with the rate element at the same position
for name, value in zip(names, values):
    print("Name: %s, Rate: %s" % (name.text, value.text))
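Pairing two separately collected lists by position only works if both lists have the same length and order. As a minimal sketch, assuming you have already pulled the .text strings out of the "ratesName" and "ratesClass" elements, a small helper can check that and turn the strings into a lookup table. The function name build_rate_table is hypothetical, not part of Selenium.

```python
def build_rate_table(names, rates):
    """Pair name strings with rate strings by position and parse the rates.

    `names` and `rates` are plain strings, e.g. the .text of each scraped
    element. Hypothetical helper for illustration only.
    """
    if len(names) != len(rates):
        # A mismatch means positional pairing would mislabel some rates.
        raise ValueError("got %d names but %d rates" % (len(names), len(rates)))
    table = {}
    for name, rate in zip(names, rates):
        # Collapse stray newlines/spaces in the name, parse the rate as a float.
        table[" ".join(name.split())] = float(rate.strip())
    return table


print(build_rate_table(["\nChilean Peso"], ["774.8540"]))
```

If the lengths differ, the helper raises instead of silently attaching rates to the wrong currency, which is the main risk of collecting the two columns independently.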
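Looking at the structure of the page, it is clear that you have to go through it row by row and select the cells you are interested in from each row.
Get the rows with find_elements_by_tag_name, then for each row use find_element_by_class_name to extract the two elements you are interested in
(documentation here: http://selenium-python.readthedocs.io/locating-elements.html)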
driver.get("https://www.iceplc.com/travel-money/exchange-rates")
# Selenium 4.3+ removed the find_element(s)_by_* helpers; there, use find_elements(By.TAG_NAME, 'tr') etc.
rates = driver.find_elements_by_tag_name('tr')
for i in rates:
    print(i.find_element_by_class_name('ratesName').text, i.find_element_by_class_name('ratesClass').text)
The output is:
US - Dollar 1.2536
Croatia - Kuna 8.3997
Canada - Dollar 1.7006
Australia - Dollar 1.6647
Euro - 1.1469
...
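The same row-by-row idea can be tried without a browser. The sketch below swaps Selenium for the standard library's html.parser and runs on the single row quoted in the question (trimmed, and wrapped in an outer table so the fragment is well formed). The names RateRowParser and extract_rates are hypothetical. With Selenium, the HTML string could presumably come from driver.page_source instead.

```python
from html.parser import HTMLParser

# The row quoted in the question, trimmed and wrapped in an outer table.
SAMPLE_HTML = """
<table><tr><td valign="top" class="OuterProdCell test">
<table class="ProductCell">
<tr>
<td class="ratesName">
<a href="/travel-money/buy-chilean-peso" title="Buy Chilean Peso">
Chilean Peso</a>
</td>
<td class="ratesClass">
<a class="orderText" href="/travel-money/buy-chilean-peso" title="Buy Chilean Peso">774.8540</a>
</td>
<td class="orderNow"><a class="reserveNow" href="/travel-money/buy-chilean-peso">Order<br/>now</a></td>
</tr>
</table>
</td></tr></table>
"""

WANTED = ("ratesName", "ratesClass")


class RateRowParser(HTMLParser):
    """Collect one (name, rate) pair per <tr> that contains both cells."""

    def __init__(self):
        super().__init__()
        self.pairs = []    # finished (name, rate) tuples
        self._row = None   # text collected for the row being read
        self._cell = None  # class of the current <td>, if it is a wanted one

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A nested <tr> simply restarts collection, so inner rows win.
            self._row = {}
        elif tag == "td":
            css_class = dict(attrs).get("class")
            self._cell = css_class if css_class in WANTED else None

    def handle_data(self, data):
        if self._row is not None and self._cell:
            self._row[self._cell] = self._row.get(self._cell, "") + data

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if all(key in self._row for key in WANTED):
                self.pairs.append(tuple(self._row[key].strip() for key in WANTED))
            self._row = None


def extract_rates(html):
    """Return a list of (name, rate) string pairs found in `html`."""
    parser = RateRowParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


print(extract_rates(SAMPLE_HTML))  # [('Chilean Peso', '774.8540')]
```

Because the name and the rate are read from the same row, a row that lacks either cell is skipped rather than shifting every later pair out of alignment.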