浏览网站并提取 phone 个号码

Navigate website and extract phone numbers

我正在尝试使用 Beautiful Soup 从多个国家/地区的网站上抓取非法 phone 号码。

html 代码具有以下结构:

<div class="Table">
<div class="Title"> </div>
<div class="Row">
<div class="Cell">
<div>United Kingdom<br>
<a href="447465832167-UnitedKingdom"><img src="flags/flag-uk.png" alt="SMS - United Kingdom" style="vertical-align: middle;">&nbsp;&nbsp;+447465832167<br>
</a><strong> SMS received:23304<section style="border:none; height: auto; padding: 1px; width: auto; background: #33FF66;"></section></strong></div>
</div>
<div class="Cell">
<div>Germany<br>
<a href="4915902933699-Germany"><img src="flags/german_flag.gif" alt="SMS - Germany" style="vertical-align: middle;">&nbsp;&nbsp;+4915902933699<br>
</a><strong> SMS received:21712<section style="border:none; height: auto; padding: 1px; width: auto; background: #33FF66;"></section></strong></div>
</div>
<div class="Cell">
<div>India<br>
<a href="919532351442-India"><img src="flags/flag-india.png" alt="SMS - India" style="vertical-align: middle;">&nbsp;&nbsp;+919532351442<br>
</a><strong> SMS received:1593<section style="border:none; height: auto; padding: 1px; width: auto; background: #33FF66;"></section></strong></div>
</div>
....
</div>

您可以看到不同的国家(英国、德国……)。每个国家/地区都有一个 phone 号码,我正在尝试获取该号码。 这是我做的:

...
soup = bs.BeautifulSoup(source, parser)
mydivs = soup.findAll("div", {"class": "Cell"})

结果:

[<div class="Cell">
<div>United Kingdom<br/>
<a href="447465832167-UnitedKingdom"><img alt="SMS - United Kingdom" src="flags/flag-uk.png" style="vertical-align: middle;"/>  +447465832167<br/>
</a><strong> SMS received:23324<section style="border:none; height: auto; padding: 1px; width: auto; background: #33FF66;"></section></strong></div>
</div>, <div class="Cell">
<div>Germany<br/>
<a href="4915902933699-Germany"><img alt="SMS - Germany" src="flags/german_flag.gif" style="vertical-align: middle;"/>  +4915902933699<br/>
</a><strong> SMS received:21739<section style="border:none; height: auto; padding: 1px; width: auto; background: #33FF66;"></section></strong></div>
</div>, <div class="Cell">
<div>India<br/>
...
]

问题: 我如何从中检索 phone 号码?

快完成了,试试看:

>>> [div.a.text.strip() for div in mydivs]
['+447465832167', '+4915902933699', '+919532351442']