BeautifulSoup extracting attribute: TypeError: 'NoneType' object is not iterable

Question

I used to get tag attributes like this:

for a in soup.find_all('img', {'data-event': 'Clicked image'}, 
src=True,alt=True):
    itemobj = a['src'] + ' --- ' + a['alt']

Now I am working on another website, and when I try the same approach it throws TypeError: 'NoneType' object is not iterable

 song_link = line.find('td').find('a')['href']  # this works well
 sss = line.find('span')['title']  # this fails; without ['title'] it works and prints the contents of the <span> tag

My data:

<span class="rating" title="4.5">
          <span class="icon-rating-sm icon-rating-sm__active"></span>
          <span class="icon-rating-sm icon-rating-sm__active"></span>
          <span class="icon-rating-sm icon-rating-sm__active"></span>
          <span class="icon-rating-sm icon-rating-sm__active"></span>
          <span class="icon-rating-sm icon-rating-sm__half"></span>
 </span>

I have been searching for a solution, but so far none has worked for me.

Answer 1

When I tried your code on the data you provided, it worked fine for me, so I assume there is more data than you showed.

soup.find('span')['title']

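only looks at the first <span> it finds. If that element has no title attribute, indexing it with ['title'] throws an exception (a KeyError).

For example, on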

<span></span>
<span class="rating" title="4.5">
      <span class="icon-rating-sm icon-rating-sm__active"></span>
      <span class="icon-rating-sm icon-rating-sm__active"></span>
      <span class="icon-rating-sm icon-rating-sm__active"></span>
      <span class="icon-rating-sm icon-rating-sm__active"></span>
      <span class="icon-rating-sm icon-rating-sm__half"></span>
</span>

the code fails, because the first <span> is the empty one.

At least, I have run into this situation a few times.
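
One way around this, as a minimal sketch: ask BeautifulSoup for a <span> that actually has a title attribute, or match on the class instead. The markup below is a shortened copy of the example above, and html.parser is just an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Shortened copy of the example markup: an empty <span> comes first.
html = """
<span></span>
<span class="rating" title="4.5">
  <span class="icon-rating-sm icon-rating-sm__active"></span>
</span>
"""

soup = BeautifulSoup(html, "html.parser")

# find('span') returns the first <span>, which has no attributes at all.
first_span = soup.find("span")
print(first_span.attrs)

# title=True only matches tags that carry a title attribute.
rating = soup.find("span", title=True)["title"]

# Matching on the class is another option.
rating_by_class = soup.find("span", class_="rating")["title"]

print(rating, rating_by_class)
```

The title=True filter is the same mechanism as the src=True, alt=True filters in the question's first snippet.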

Answer 2

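Almost three years have passed since I posted this extremely unclear question, and somehow I forgot about it until today. I see that a lot of people run into this issue, so I would like to offer my input, which I believe can resolve the problem in your code.

Unfortunately, I don't remember exactly what was wrong with my code, and everything I wrote at the time was very unclear. However, I do have some ideas about what could have caused the problem. Here are my suggestions.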

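1. Read the documentation carefully

BeautifulSoup has well-written documentation. If you don't know how to use selectors and are looking for a lazy answer, like I was, I strongly recommend looking at the BS4 documentation here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/# (read the part on selectors in particular, because misused CSS selectors cause most of the problems). Instead of spending 5 minutes hunting for an answer, spend 10 minutes learning how it works. I assure you it will benefit you far more.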

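2. Make sure you have the right object

By running print(dir(your_object)) you can see all the methods and attributes available on your object. Also, whenever you get stuck, try debugging the code to track down the error. Back then I was using the IDLE editor, but I recently realized that VS Code has a built-in Python debugger, which is extremely useful and will solve your problem 99% of the time.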
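
A quick sketch of what that inspection can look like. The markup here is a made-up one-liner, not the original page; the point is only that find() gives back either a Tag or None, and the two behave very differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
soup = BeautifulSoup("<td><a href='/song'>Song</a></td>", "html.parser")

found = soup.find("a")       # a Tag object
missing = soup.find("span")  # no <span> in this markup, so this is None

print(type(found))
print(type(missing))

# attrs holds the tag's attributes as a dict.
print(found.attrs)

# dir() lists what the object offers, e.g. the get() method.
print("get" in dir(found))
```

If type() prints NoneType, any ['title'] lookup on that object will raise a TypeError, so the search itself is what needs fixing.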

3. Make sure you are getting the right item

As @Friedrich Staufenbiel pointed out, my data most likely contained an extra <span> element, like this

<span></span>
<span class="rating" title="4.5">
    <span class="icon-rating-sm icon-rating-sm__active"></span>
    <span class="icon-rating-sm icon-rating-sm__active"></span>
    <span class="icon-rating-sm icon-rating-sm__active"></span>
    <span class="icon-rating-sm icon-rating-sm__active"></span>
    <span class="icon-rating-sm icon-rating-sm__half"></span>
</span>

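This is most likely what caused the problem. However, I want to point out that I was looking for the <span> element inside the line variable, which was most likely one element of a list being iterated over in a for loop. In that case the program is expected to crash, because not every line contains a <span> element: for those lines line.find('span') returns None, and indexing None raises a TypeError. The best thing you can do is put the part of the code that causes the problem in a try-except block, like this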

try:
    sss = line.find('span')['title']
except Exception as e:
    print(e)

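At least this way you can find the part of the code that crashes the program, and you can pass the error message on to others so they can help you better.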
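
As an alternative to catching the exception, here is a rough sketch that checks for None explicitly and uses .get(), which returns None when the attribute is missing. The table markup is invented for illustration (I no longer have the real page); only the line, span, and title names come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical rows: one with a rating, one without a <span>,
# and one whose <span> has no title attribute.
html = """
<table>
  <tr><td><a href="/song-1">Song 1</a></td><td><span class="rating" title="4.5"></span></td></tr>
  <tr><td><a href="/song-2">Song 2</a></td><td></td></tr>
  <tr><td><a href="/song-3">Song 3</a></td><td><span></span></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

ratings = []
for line in soup.find_all("tr"):
    span = line.find("span")
    # Guard against a missing <span>, then against a missing title.
    title = span.get("title") if span is not None else None
    ratings.append(title)

print(ratings)
```

Rows without a usable rating end up as None in the list instead of stopping the loop, so you can decide afterwards whether to skip or report them.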

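I am 100% sure that if you follow the points above you will solve whatever problem you run into, and they are good engineering practices to have anyway. Special thanks to @Friedrich Staufenbiel and @matusf for their attention.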

python

beautifulsoup

python-3.6