Python: Returning parsed info to a list?

Question

My code:

from urllib2 import urlopen
from bs4 import BeautifulSoup

url = "https://realpython.com/practice/profiles.html"

html_page = urlopen(url)
html_text = html_page.read()

soup = BeautifulSoup(html_text)

links = soup.find_all('a', href = True)

files = []

def page_names():
    for a in links:
        files.append(a['href'])
        return files


page_names()

print files[:]

base = "https://realpython.com/practice/"

print base + files[:]

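I'm trying to parse out the three web page file names and append them to the "files" list, then somehow append or add them to the end of the base URL for a simple print.

I've tried making "base" a single-item list so I could append to it, but I'm fairly new to Python and believe I'm messing up my for statement.

Currently I get: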

print files[:]
TypeError: 'type' object has no attribute '__getitem__'

Answer 1

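At the end you wrote list[:], which is wrong: list is the built-in type used to create lists, not a variable holding your data, and in Python 2 slicing the type itself raises the TypeError shown above. Two more things need fixing. The return inside the for loop makes page_names() exit after the first link, and base + files[:] tries to concatenate a string with a list, so loop over the list and print each item instead: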
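
The placement of return in the question's page_names() matters as well. The following is a minimal sketch of that effect, written for Python 3 (where print is a function); the function names and the sample list are made up for illustration, and plain strings stand in for the parsed tags.

```python
# Hypothetical sample data standing in for the href values found on the page.
SAMPLE_HREFS = ["aphrodite.html", "poseidon.html", "dionysus.html"]


def names_return_inside_loop(hrefs):
    """Mimics the question's function: `return` is indented inside the loop."""
    files = []
    for href in hrefs:
        files.append(href)
        return files  # exits during the first iteration
    return files  # only reached when hrefs is empty


def names_return_after_loop(hrefs):
    """Returns only after every item has been appended."""
    files = []
    for href in hrefs:
        files.append(href)
    return files


print(names_return_inside_loop(SAMPLE_HREFS))  # ['aphrodite.html']
print(names_return_after_loop(SAMPLE_HREFS))
```

The first function hands back a one-item list because the loop never reaches its second iteration; moving the return out of the loop (or dropping it, as the code below does) lets every item be collected.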

from urllib2 import urlopen
from bs4 import BeautifulSoup

url = "https://realpython.com/practice/profiles.html"

html_page = urlopen(url)
html_text = html_page.read()

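# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html_text, "html.parser")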

links = soup.find_all('a', href = True)

files = []

def page_names():
    for a in links:
        files.append(a['href'])


page_names()


base = "https://realpython.com/practice/"
for i in files:
    print base + i

Output:

https://realpython.com/practice/aphrodite.html
https://realpython.com/practice/poseidon.html
https://realpython.com/practice/dionysus.html
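
Plain string concatenation works here because base ends with a slash. As an alternative sketch, assuming Python 3, the standard library's urllib.parse.urljoin can build the same URLs (in Python 2 the same function lives in the urlparse module). The files list below is hard-coded from the output above rather than scraped.

```python
from urllib.parse import urljoin

base = "https://realpython.com/practice/"
# Hard-coded here for illustration; in the answer these come from the parsed page.
files = ["aphrodite.html", "poseidon.html", "dionysus.html"]

# urljoin resolves each relative name against the base URL.
full_urls = [urljoin(base, name) for name in files]

for full_url in full_urls:
    print(full_url)
```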

You also don't need an intermediate list for the links or a helper function to fill files; a list comprehension does it in one line.

from urllib2 import urlopen
from bs4 import BeautifulSoup
url = "https://realpython.com/practice/profiles.html"
html_page = urlopen(url)
html_text = html_page.read()
soup = BeautifulSoup(html_text, "html.parser")
files = [i['href'] for i in soup.find_all('a', href = True)]
base = "https://realpython.com/practice/"
for i in files:
    print base + i
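
If the goal is a function that returns the list rather than filling a global, one possible sketch follows. It assumes Python 3 and uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra packages. The HrefCollector class, the page_names(html_text) signature, and the sample HTML are illustrative assumptions, not the actual content of the practice page.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def page_names(html_text):
    """Return a new list of href values found in html_text."""
    collector = HrefCollector()
    collector.feed(html_text)
    return collector.hrefs


# Made-up markup for illustration only.
SAMPLE_HTML = """
<html><body>
<a href="aphrodite.html">Aphrodite</a>
<a href="poseidon.html">Poseidon</a>
<a>no link here</a>
<a href="dionysus.html">Dionysus</a>
</body></html>
"""

files = page_names(SAMPLE_HTML)
print(files)
```

Returning the list keeps the function free of side effects, so the caller decides where the result is stored.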

