findChildren() 方法存储两个相同的 child 而不是一个

findChildren() method is storing two of the same child rather than one

从我用 urllib2 打开并用 BeautifulSoup 抓取的网页,我试图从网页中存储特定文本。

在您看到代码之前,这里是 link 到网页 HTML 的屏幕截图,以便您了解我使用 find 函数的方式 BeautifulSoup:

HTML from webpage

最后,这是我使用的代码:

from BeautifulSoup import BeautifulSoup
import urllib2

url = 'http://www.sciencekids.co.nz/sciencefacts/animals/bird.html'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page.read())

ul = soup.find('ul', {'class': 'style33'})
children = ul.findChildren()
for child in children:
    print child.text

这是我的问题所在的输出:

Birds have feathers, wings, lay eggs and are warm blooded.
Birds have feathers, wings, lay eggs and are warm blooded.
There are around 10000 different species of birds worldwide.
There are around 10000 different species of birds worldwide.
The Ostrich is the largest bird in the world. It also lays the largest eggs and has the  fastest maximum running speed (97 kph).
The Ostrich is the largest bird in the world. It also lays the largest eggs and has the  fastest maximum running speed (97 kph).
Scientists believe that birds evolved from theropod dinosaurs.
Scientists believe that birds evolved from theropod dinosaurs.
Birds have hollow bones which help them fly.
Birds have hollow bones which help them fly.
Some bird species are intelligent enough to create and use tools.
Some bird species are intelligent enough to create and use tools.
The chicken is the most common species of bird found in the world.
The chicken is the most common species of bird found in the world.
Kiwis are endangered, flightless birds that live in New Zealand. They lay the largest eggs relative to their body size of any bird in the world.
Kiwis are endangered, flightless birds that live in New Zealand. They lay the largest eggs relative to their body size of any bird in the world.
Hummingbirds can fly backwards.
Hummingbirds can fly backwards.
The Bee Hummingbird is the smallest living bird in the world, with a length of just 5 cm (2 in).
The Bee Hummingbird is the smallest living bird in the world, with a length of just 5 cm (2 in).
Around 20% of bird species migrate long distances every year.
Around 20% of bird species migrate long distances every year.
Homing pigeons are bred to find their way home from long distances away and have been used for thousands of years to carry messages.
Homing pigeons are bred to find their way home from long distances away and have been used for thousands of years to carry messages.

我的代码中是否存在我使用不正确 and/or 的错误,导致本应只有一个的地方有两个 children?创建一些额外的代码很容易,这样我就不会存储相同信息的重复项,但我宁愿以正确的方式执行此操作,这样我只得到我要查找的每个字符串之一。

children = ul.findChildren() 正在选择 <ul> 中的 <li><p>。迭代 children 会导致您打印这两个元素的 text 属性。要解决此问题,只需将 children = ul.findChildren() 更改为 children = ul.findChildren("p")