'NoneType' object has no attribute 'text' when attempting to retrieve table data

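Similar questions have been asked a hundred times, but none of the solutions fix my problem. I created an HTML document, hosted on GitHub, that contains a table. The table will be used to store player usernames, passwords, and user IDs. Web scraping works fine in a completely different project I'm working on, but it doesn't work here.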

Below is the Python script that scrapes the website I created:

from bs4 import BeautifulSoup
import requests


def getData():
    url = 'https://galaxy-indie-studio.github.io/Galaxy-Indie-Studio-Website/database.html'
    html_url = requests.get(url).text
    soup = BeautifulSoup(html_url, "lxml")

    database = soup.find_all('tr', class_="Player")

    for Players in database:
        username = Players.find('td', class_="Username").text 
        password = Players.find('td', class_="Password").text
        userID = Players.find('td', class_="UserID").text

        print(f"Username in database {username}")
        print(f"Password in database {password}")
        print(f"UserID in database {userID}")

If I leave .text at the end of any of the variables, I get AttributeError: 'NoneType' object has no attribute 'text'. If I remove .text, find returns None. The same happens for the username, the password, and the user ID.

Below is the code of the website I'm currently using to store the table:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
        <title>Player Database</title>
    </head>
    <body>
        <table border = 4px, bgcolor="black", width= 100%> 
            <tr>
                <th width="150" height="20" bgcolor="lightgray">Username</th>
                <th width="150" height="20" bgcolor="lightgray">Password</th>
                <th width="150" height="20" bgcolor="lightgray">UserID</th>
            </tr>
            <tr class="Player", width="150" height="20">
                <td align="center" bgcolor="lightgray",class="Username">BigTall12</td>
                <td align = "center" bgcolor="lightgray",class="Password"></td>
                <td align="center" bgcolor="lightgray",class="UserID"></td>
            </tr>

            <tr class="Player",width="150" height="20">
                <td align="center" bgcolor="lightgray",class="Username"></td>
                <td align = "center" bgcolor="lightgray",class="Password"></td>
                <td align = "center" bgcolor="lightgray",class="UserID"></td>
            </tr>

            <tr class="Player",width="150" height="20">
                <td align="center" bgcolor="lightgray",class="Username"></td>
                <td align="center" bgcolor="lightgray",class="Password"></td>
                <td align="center"bgcolor="lightgray",class="UserID"></td>
            </tr>

            <tr class="Player",width="150" height="20">
                <td align="center" bgcolor="lightgray",class="Username"></td>
                <td align="center" bgcolor="lightgray",class="Password"></td>
                <td align="center" bgcolor="lightgray",class="UserID"></td>
            </tr>

        </table>
    </body>
</html>

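That's because your <td> tags don't have a class attribute. What you do have is a ,class attribute (note the leading comma), which bs4 won't recognize as class.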
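To see the effect in isolation, here is a minimal sketch that parses one cell from the page with and without the stray comma. It uses the built-in "html.parser" backend instead of "lxml" purely to avoid an extra dependency (that choice is an assumption, not part of the original script), and the helper name find_username is made up for this illustration. Depending on the parser, the malformed attribute is either stored under a name that includes the comma or dropped entirely; in both cases a lookup by class finds nothing.

```python
from bs4 import BeautifulSoup

# One row copied from the page, with the comma before class left in place.
broken = (
    '<table><tr class="Player">'
    '<td align="center" bgcolor="lightgray",class="Username">BigTall12</td>'
    '</tr></table>'
)
# The same row with the comma replaced by a space.
fixed = broken.replace(',class', ' class')


def find_username(html):
    """Hypothetical helper: return the first <td class="Username">, or None."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find('td', class_="Username")


broken_cell = find_username(broken)
fixed_cell = find_username(fixed)

print(broken_cell)      # None: there is no attribute literally named "class"
print(fixed_cell.text)  # BigTall12
```

Calling .text on broken_cell is exactly what raises the AttributeError from the question.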

In other words, your HTML is malformed. Remove the commas before the class attributes in the source HTML.

For example:

`<td align="center" bgcolor="lightgray",class="Username">BigTall12</td>` 

should be

`<td align="center" bgcolor="lightgray" class="Username">BigTall12</td>`

Alternatively, fix it after reading the HTML:

import requests
from bs4 import BeautifulSoup

def getData():
    url = 'https://galaxy-indie-studio.github.io/Galaxy-Indie-Studio-Website/database.html'
    html_url = requests.get(url).text
    html_url = html_url.replace(',class', ' class')
    
    soup = BeautifulSoup(html_url, "lxml")

    database = soup.find_all('tr', class_="Player")

    for Players in database:
        username = Players.find('td', class_="Username").text 
        password = Players.find('td', class_="Password").text
        userID = Players.find('td', class_="UserID").text

        print(f"Username in database {username}")
        print(f"Password in database {password}")
        print(f"UserID in database {userID}")
        
        
getData()
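If the markup cannot be guaranteed to be clean, it may also help to guard against find returning None instead of calling .text on it directly. The following is only a sketch under assumptions: cell_text is a hypothetical helper, not part of bs4, the sample table is a cut-down stand-in for the real page, and "html.parser" is used so the snippet runs without lxml.

```python
from bs4 import BeautifulSoup


def cell_text(row, css_class, default=None):
    """Hypothetical helper: text of the first <td> with this class, or `default`."""
    cell = row.find('td', class_=css_class)
    # find returns None when no matching cell exists, so check before reading text.
    return cell.get_text(strip=True) if cell is not None else default


# Cut-down stand-in for the real page; the UserID cell is deliberately missing.
sample = '''
<table>
  <tr class="Player">
    <td class="Username">BigTall12</td>
    <td class="Password"></td>
  </tr>
</table>
'''

soup = BeautifulSoup(sample, "html.parser")
rows = [
    {name: cell_text(row, name) for name in ("Username", "Password", "UserID")}
    for row in soup.find_all('tr', class_="Player")
]
print(rows)  # [{'Username': 'BigTall12', 'Password': '', 'UserID': None}]
```

An empty cell yields an empty string, while a cell that could not be found yields None, so the two cases stay distinguishable.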