Python: Page Navigator Maximum Value Scraper - only getting the output of the last value

This is a program I created to extract the maximum page value from the page navigator of each category in a list. I can't get all the values; I only get the value for the last item in the list. What do I need to change to get all the outputs?
import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

# List of extensions to the base URL
links = ['Link_1/', 'Link_2/', 'Link_3/']

# Function to find the biggest number present in the page navigation
# section. The element just before 'Next →' holds the upper limit.
def page_no():
    bs = soup(page_html, "html.parser")
    max_page = bs.find('a', {'class': 'next page-numbers'}).findPrevious().text
    print(max_page)

# URL loop
for url in links:
    my_urls = 'http://example.com/category/{}/'.format(url)
    # opening the connection, grabbing the page
    uClient = uReq(my_urls)
    page_html = uClient.read()
    uClient.close()
    page_no()
Page navigator example:

1 2 3 … 15 Next →

Thanks in advance.
You need to pass page_html into the function and indent the last 4 lines. Also, it is better to return the max_page value so you can use it outside the function.
def page_no(page_html):
    bs = soup(page_html, "html.parser")
    max_page = bs.find('a', {'class': 'next page-numbers'}).findPrevious().text
    return max_page

# URL loop
for url in links:
    my_urls = 'http://example.com/category/{}/'.format(url)
    # opening the connection, grabbing the page
    uClient = uReq(my_urls)
    page_html = uClient.read()
    uClient.close()
    max_page = page_no(page_html)
    print(max_page)
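One thing to watch for: on a category that fits on a single page, themes typically omit the "Next →" link entirely, so find() returns None and the scraper would raise an AttributeError. A minimal sketch of a more defensive version, assuming the same markup conventions (the inline HTML fragments below are made-up stand-ins for fetched pages, not real responses):

    from bs4 import BeautifulSoup as soup

    # Made-up HTML fragments standing in for fetched category pages.
    multi_page = '<a class="page-numbers">15</a><a class="next page-numbers">Next →</a>'
    single_page = '<p>no pagination links here</p>'

    def page_no(page_html):
        bs = soup(page_html, "html.parser")
        next_link = bs.find('a', {'class': 'next page-numbers'})
        if next_link is None:
            # No 'Next →' link: the category fits on a single page.
            return 1
        # The element just before 'Next →' holds the highest page number.
        return int(next_link.find_previous().text)

    print(page_no(multi_page))   # 15
    print(page_no(single_page))  # 1

Returning an int instead of the raw text also makes the value directly usable for building a page-by-page crawl loop afterwards.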