The website has 9 pages, but my code only adds the last page's elements to the list

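The website has 9 pages, and my code only adds the elements from the last page to the list. I want to add all elements from all pages to the list together.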

alltitles = []
allnames = []
alllinks = []
allpeices = []
allstocks = []
for n in range(pagenum):
    pages_url = f"https://www.ispsupplies.com/manufacturers/TP~Link?order=relevance:asc&page={n+1}&keywords=tp-link"
    driver.get(pages_url)
    html = driver.page_source
    soup = Soup(html)
    title = soup.find_all("span", itemprop="name")
    titleloop = [titles.text for titles in title]
    alltitles.append(titleloop)
    name = soup.find_all("div", class_="item-details-sku-container")
    nameloop = [names.text for names in name]
    allnames.append(nameloop)
    link = soup.find_all("a", class_="facets-item-cell-grid-title")
    linkloop = [links.text for links in link]
    alllinks.append(linkloop)
    price = soup.find_all("span", class_="item-views-price-lead")
    priceloop = [prices.text for prices in price]
    allpeices.append(priceloop)
    stock = soup.find_all("div", class_="item-details-stock")
    stockloop = [stocks.text for stocks in stock]
    allstocks.append(stockloop)

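What happens?

Your code itself works, but it iterates very quickly, so the elements you are looking for are not yet present in the page when you try to find them.

How to fix?

Use Selenium waits to check that the elements are present in the DOM before reading the page source: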

...
driver.get(pages_url)
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-type="item"]')))
html = driver.page_source
...

Note: this requires some additional imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
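
Conceptually, an explicit wait keeps polling a condition until it holds or a timeout expires. The sketch below imitates that idea in plain Python so it can run without a browser. `wait_until` and `FakePage` are hypothetical names made up for this illustration; they are not part of Selenium, and this is not a copy of how `WebDriverWait` is implemented.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page whose items only appear on the third check.
class FakePage:
    def __init__(self, ready_after):
        self.checks = 0
        self.ready_after = ready_after

    def find_items(self):
        self.checks += 1
        return ["item"] if self.checks >= self.ready_after else []


page = FakePage(ready_after=3)
items = wait_until(page.find_items, timeout=2.0, poll_interval=0.01)
```

Reading `driver.page_source` immediately after `driver.get()` corresponds to checking the condition only once, which is why the items may be missing.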

Example

Not sure why you decided on these separate lists; this example works with a single list of dicts instead:

data = []

for n in range(2):
    pages_url = f"https://www.ispsupplies.com/manufacturers/TP~Link?order=relevance:asc&page={n+1}&keywords=tp-link"
    driver.get(pages_url)
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-type="item"]')))
    html = driver.page_source
    soup = Soup(html)
    
    for item in soup.select('[data-type="item"]'):
        data.append({
            'title' : item.find("span", itemprop="name").text,
            'name' : item.find("div", class_="item-details-sku-container").text,
            'link' : item.find("a", class_="facets-item-cell-grid-title")['href'],
            'price' : item.find("span", class_="item-views-price-lead").text,
            'stock' : item.find("div", class_="item-details-stock").text.strip()
        })
        
pd.DataFrame(data)

Output

| title | name | link | price | stock |
| --- | --- | --- | --- | --- |
| TP-Link AC750 Wireless Dual Band Router | SKU: Archer C20 | /TP-Link-Archer-C20 | US$34.99 | Direct Ship item Item usually ships directly from the manufacturer |
| TP-Link 16-Port Gigabit Unmanaged Pro Switch | SKU: TL-SG116E | /TP-Link-TL-SG116E | US$79.99 | 3 In Stock |
| TP-Link AC1200 Wireless MU-MIMO Gigabit Router Archer A6 | SKU: Archer A6_V3 | /TP-Link-Archer-A6 | US$49.99 | Direct Ship item Item usually ships directly from the manufacturer |
| TP-Link AC4000 MU-MIMO Tri-Band Wi-Fi Router Archer A20 | SKU: Archer A20 | /TP-Link-Archer-A20 | US$189.99 | Direct Ship item Item usually ships directly from the manufacturer |
| TP-Link AC5400 MU-MIMO Tri-Band Gaming Router | SKU: Archer C5400X | /TP-Link-Archer-C5400X | US$279.99 | Direct Ship item Item usually ships directly from the manufacturer |
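
As a side note on the list handling: the code in the question calls `append()` with a whole list per page, which yields a list of lists rather than one flat list. The small sketch below contrasts that with `extend()` and with the one-dict-per-product approach used in the example above. The `pages` data is made up for illustration and does not come from the website.

```python
# Hypothetical scraped results for two pages (made-up values).
pages = [
    [("Router A", "$10"), ("Switch B", "$20")],
    [("Adapter C", "$30")],
]

# append() adds each page's list as ONE element, giving a list of lists.
nested_titles = []
for page in pages:
    nested_titles.append([title for title, _ in page])

# extend() adds the individual elements, giving one flat list.
flat_titles = []
for page in pages:
    flat_titles.extend(title for title, _ in page)

# One dict per product keeps the fields of each item together.
records = []
for page in pages:
    for title, price in page:
        records.append({"title": title, "price": price})
```

With separate parallel lists, a product that lacks one field would shift every later value in that list; a list of dicts avoids that because each record is built from a single item.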

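Is there any reason not to go through the API? It is more efficient and you get more data. You can always filter out the columns you don't need.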

import requests
import pandas as pd

items = []
page = 0
while True:
    url = 'https://www.ispsupplies.com/api/items'
    payload = {
        '_t': '1641815468877',
        'c': '393682',
        'country': 'US',
        'currency': 'USD',
        'custitem_disable_from_main_website': '0',
        'custitem_is_international': '0',
        'fieldset': 'search',
        'include': 'facets',
        'language': 'en',
        'limit': '100',
        'manufacturers': 'TP~Link',
        'n': '2',
        'nocache': 'T',
        'offset': str(page*100),
        'sort': 'quantityavailable:desc',
    }

    jsonData = requests.get(url, params=payload).json()
    
    items += jsonData['items']
    print('Page: %s' %(page+1))
    
    if len(jsonData['items']) < 100:
        break
    page += 1
    
df = pd.DataFrame(items)

Output:

Full output (only the first 5 rows of the 199 products):

print(df.head(5).to_string())
  custitem88 custitem89 custitem83  custitem_is_international custitem_open_box_ids custitem_ns_pr_item_attributes  custitemnew  ispurchasable custitem_ns_pr_attributes_rating stockdescription  custitemclearance                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      itemimages_detail custitem_commercecategory_brand custitemwarehousemessage  custitem_incanada                                                    onlinecustomerprice_detail custitem71  weight custitem_ns_pr_rating_by_rate  internalid                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                                                                                                                                      itemoptions_detail outofstockmessage                                                                     custitemextralargeimage2 custitem_availableus                                              storedescription pricelevel1_formatted  isinstock custitem67  custitem20  custitem21  onlinecustomerprice  dontshowprice  custitemrefurbished  custitemonsale custitem68 manufacturer  custitem69  custitemfree_shipping         itemid  custitemondiscount  offersupport onlinecustomerprice_formatted nopricemessage  custitem_disable_from_main_website pricelevel66_formatted  isbackorderable  custitemtariff_item  custitemfree_shipping_cw                                                       custitem93 custitem94  custitem19  custitem18 custitem_st7 custitem_st6  showoutofstockmessage outofstockbehavior custitem_st8  itemtype  quantityavailable custitem_st3 custitem_st2 custitem_st5 displayname                                    storedisplayname2 custitem_st4 custitem_availableca  pricelevel1 custitem_st1  custitem_gpon                                         urlcomponent  pricelevel66 custitem_commerce_category_1 custitem_commerce_category_3 custitem_commerce_category_2
0                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          {'5366': {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-Gigabit-PCI-Express-Network-Adapter-TG-3468.5366-2.jpg'}]}}                         TP-Link                11/8/2021              False  {'onlinecustomerprice_formatted': 'US$‎14.99', 'onlinecustomerprice': 14.99}               0.50                                      5366  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                     /core/media/media.nl?id=922920&c=393682&h=1qP1ijidIPW2P4DK3Fi_jlV_N3UT-StJuJYKXsZSuMSrOrIn                  109                           32-bit Gigabit PCIe Network Adapter             US$‎14.99       True                   2.25       False                14.99          False                False           False                 TP-Link       False                   True        TG-3468               False         False                     US$‎14.99                                              False              US$‎14.99             True                False                     False  <div class="stock-detail-in ">In stock at College Station</div>                   5.50        6.25                                            False        - Default -               InvtPart              109.0                                                             TP-LINK 32-bit Gigabit PCIe Network Adapter                                          14.99                       False  TP-LINK-Gigabit-PCI-Express-Network-Adapter-TG-3468         14.99                 PCI Adapters                          NaN                          NaN
1                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-TL-PA4010-KIT.01.jpg'}]}                         TP-Link                11/8/2021              False  {'onlinecustomerprice_formatted': 'US$‎39.99', 'onlinecustomerprice': 39.99}               1.00                                      5406  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                     /core/media/media.nl?id=875835&c=393682&h=blNs8_wT0YD2isH8-8LHyXuDVz82k4V5VxMsQVeVrrUeVsAE                   94  AV500 Nano Powerline Ethernet Adapter Starter Kit, Twin Pack             US$‎39.99       True                   4.00       False                39.99          False                False           False                 TP-Link       False                   True  TL-PA4010 KIT               False         False                     US$‎39.99                                              False              US$‎39.99             True                False                     False  <div class="stock-detail-in ">In stock at College Station</div>                   6.00        8.00                                            False        - Default -               InvtPart               94.0                                                                     TP-LINK AV600 Powerline Starter Kit                                          39.99                       False                                TP-LINK-TL-PA4010-KIT         39.99            Powerline Systems                          NaN                          NaN
2                     0                                 False                                               &nbsp;        False           True                                                                False  {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.01.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.02.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.03.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.04.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.05.jpg'}]}                         TP-Link                9/17/2021              False  {'onlinecustomerprice_formatted': 'US$‎12.99', 'onlinecustomerprice': 12.99}               0.25                                     20996  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                    /core/media/media.nl?id=7189171&c=393682&h=qYPfPWXvWc_Udet9IChlyz96qbiA25Y-jMsjg8svIFm-WHxm                   79                                                                           US$‎12.99       True                   0.67       False                12.99          False                False           False                 TP-Link       False                  False          UE300               False         False                     US$‎12.99                                              False              US$‎12.99             True                False                     False  <div class="stock-detail-in ">In stock at College Station</div>                   3.35        6.10                                            False        - Default -               InvtPart               79.0                                                     TP-Link USB 3.0 to Gigabit Ethernet Network Adapter                                          12.99                       False                                        TP-Link-UE300         12.99               USB Converters                          NaN                          NaN
3                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                          {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210.001.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210.002.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210.003.jpg'}]}                         TP-Link                9/22/2021              False  {'onlinecustomerprice_formatted': 'US$‎39.99', 'onlinecustomerprice': 39.99}               1.65                                      5319  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                     /core/media/media.nl?id=875579&c=393682&h=skaSM39aCBHsxoAkbixkUtedRt2h7qw6xp6EXKWbFg9QUAGA                   71       Outdoor 2.4GHz 300Mbps High power Wireless Access Point             US$‎39.99       True                   4.10       False                39.99          False                False           False                 TP-Link       False                   True         CPE210               False         False                     US$‎39.99                                              False              US$‎39.99             True                False                     False  <div class="stock-detail-in ">In stock at College Station</div>                   5.25       10.62                                            False        - Default -               InvtPart               71.0                                                          TP-LINK 2.4GHz 300Mbps 9dBi Outdoor CPE CPE210                                          39.99                       False       TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210         39.99                2GHz PTP/PTMP                          NaN                          NaN
4                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                                                                                                              {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-TL-WR902AC.011.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-TL-WR902AC.012.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-TL-WR902AC.013.jpg'}]}                         TP-Link               11/29/2021              False  {'onlinecustomerprice_formatted': 'US$‎39.99', 'onlinecustomerprice': 39.99}               0.60                                      5512  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                    /core/media/media.nl?id=1056731&c=393682&h=che7-nic7o8Sln8Cl1UJWkH_DVUv7VRlcJi9_va_9WP4bFwv                   60                  AC750 Portable Wi-Fi Travel Router, 2.4/5GHz             US$‎39.99       True                   3.00       False                39.99          False                False           False                 TP-Link       False                   True     TL-WR902AC               False         False                     US$‎39.99                                              False              US$‎39.99             True                False                     False  <div class="stock-detail-in ">In stock at College Station</div>                   4.50        4.50                                            False        - Default -               InvtPart               60.0                                                           TP-Link AC750 Wireless Travel Router 2.4/5GHz                                          39.99                       False                                   TP-Link-TL-WR902AC         39.99             Wireless Routers                          NaN    

Or just what you see on the website:

print(df[['storedisplayname2', 
          'itemid', 
          'urlcomponent',
          'onlinecustomerprice_formatted',
          'quantityavailable']].head(5).to_string())


                                     storedisplayname2         itemid                                         urlcomponent onlinecustomerprice_formatted  quantityavailable
0          TP-LINK 32-bit Gigabit PCIe Network Adapter        TG-3468  TP-LINK-Gigabit-PCI-Express-Network-Adapter-TG-3468                     US$‎14.99              109.0
1                  TP-LINK AV600 Powerline Starter Kit  TL-PA4010 KIT                                TP-LINK-TL-PA4010-KIT                     US$‎39.99               94.0
2  TP-Link USB 3.0 to Gigabit Ethernet Network Adapter          UE300                                        TP-Link-UE300                     US$‎12.99               79.0
3       TP-LINK 2.4GHz 300Mbps 9dBi Outdoor CPE CPE210         CPE210       TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210                     US$‎39.99               71.0
4        TP-Link AC750 Wireless Travel Router 2.4/5GHz     TL-WR902AC                                   TP-Link-TL-WR902AC                     US$‎39.99               60.0
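
The paging logic in the API example (request `limit` items at a growing `offset`, stop when a page comes back short) can be isolated and checked without touching the network. This is a minimal sketch under that assumption: `fetch_all` and `fake_fetch` are hypothetical names, and the in-memory `catalog` merely stands in for the JSON that `requests.get(...)` returns.

```python
def fetch_all(fetch_page, limit=100):
    """Collect items page by page until a page comes back shorter than `limit`."""
    items = []
    offset = 0
    while True:
        batch = fetch_page(offset, limit)
        items.extend(batch)
        if len(batch) < limit:
            break
        offset += limit
    return items


# Hypothetical in-memory "API" with 199 products, standing in for requests.get().
catalog = [f"product-{i}" for i in range(199)]
calls = []


def fake_fetch(offset, limit):
    calls.append(offset)  # record which offsets were requested
    return catalog[offset:offset + limit]


all_items = fetch_all(fake_fetch, limit=100)
```

With 199 products this makes two requests (offsets 0 and 100). If the total were an exact multiple of the limit, the loop would make one extra request that returns an empty page and then stop.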