After web scraping with Selenium, I got weird results in my CSV file: instead of the actual contents, the cells contain HTML-like code

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By

import time
import csv


driver = webdriver.Chrome('/Users/myname/Desktop/web_crawling/chromedriver')

driver.get('https://www.naver.com')
time.sleep(2)


driver.find_element(by=By.CSS_SELECTOR, value='a.nav.shop').click()

search = driver.find_element(by=By.CSS_SELECTOR,value='._searchInput_search_input_QXUFf')
search.click()

search.send_keys("아이폰 13")
search.send_keys(Keys.ENTER)

before_h = driver.execute_script("return window.scrollY")

while True:
    driver.find_element(by=By.CSS_SELECTOR, value='body').send_keys(Keys.END)
    time.sleep(1)

    after_h = driver.execute_script("return window.scrollY")

    if after_h == before_h:
        break
    before_h = after_h

#create csv file
f = open(r"/Users/yungijeong/Desktop/web_crawling/data.csv", 'w', encoding='UTF8')
csvWriter = csv.writer(f)

items = driver.find_elements(by=By.CSS_SELECTOR, value=".basicList_info_area__17Xyo")

for item in items:
    names = item.find_elements(by=By.CSS_SELECTOR,  value=".basicList_link__1MaTN")
    for name in names:
        print(name.text)

    try:
        prices = item.find_elements(by=By.CSS_SELECTOR, value=".price_num__2WUXn")
        for price in prices:
            print(price.text)
    except:
        print("판매중단")
    links = item.find_elements(by=By.CSS_SELECTOR, value=".basicList_title__3P9Q7 > a")
    for link in links:
        print(link.get_attribute('href'))
    print(name, price, link)

    #adding inside the csv files

    csvWriter.writerow([name, price, link])

f.close()

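Here I am trying to scrape iPhone details and prices from a Korean shopping website. I wrote the code so that the webdriver automatically opens the site and collects all the details (such as each product's price and link). Finally, it should create a CSV file and write all the scraped data to it.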

The code runs without errors, but when I export the data to a CSV file, it looks like this: (image: The result in csv)

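The contents don't seem to be exported correctly. Every cell looks like HTML code... Has anyone run into the same problem? In the terminal, it looks like the webdriver picks out the data correctly, but the results are exported in a strange way. If you have had the same problem, please share!!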

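I think the whole problem is that you print() the values but never assign them to variables. So name, price and link still hold WebElement objects, and csv.writer writes their string representation (which looks like HTML tags) instead of the text.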
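To illustrate the point without needing a browser, here is a minimal sketch. FakeElement is a hypothetical stand-in I made up for this example (it is not a Selenium class); its repr only imitates the angle-bracket style that a real WebElement prints. The helper row_as_csv is also just for illustration.

```python
import csv
import io


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (not a real Selenium class)."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        # Imitates the angle-bracket style of a WebElement's repr.
        return '<FakeElement (session="abc", element="123")>'


def row_as_csv(cells):
    """Write one row to an in-memory CSV and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(cells)
    return buffer.getvalue()


name = FakeElement("iPhone 13")

# csv.writer converts non-string cells with str(), so the object's repr is written.
wrong = row_as_csv([name])

# Passing the extracted string writes the actual product name.
right = row_as_csv([name.text])

print(wrong)
print(right)
```

The first row contains the object's representation, the second contains the text, which is the difference you are seeing between the terminal output and the file.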

You have

print(name.text)
print(price.text)
print(link.get_attribute('href'))

but you forgot

name = name.text
price = price.text
link = link.get_attribute('href')

Or you should write

csvWriter.writerow([name.text, price.text, link.get_attribute('href')])
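One more thing that may be worth handling: find_elements() returns an empty list instead of raising when nothing matches, so the except branch in the question probably never runs, and after an empty loop the variables still hold whatever the previous item left in them. Below is a rough sketch of one way to build each row from plain strings. The helper names (first_text, first_href, build_row) and StubElement are my own inventions for this example, and the stub only mimics the .text attribute and get_attribute() method used in the question so the sketch runs without a browser.

```python
import csv
import io


def first_text(elements, default=""):
    """Return the .text of the first element, or a default if the list is empty."""
    return elements[0].text if elements else default


def first_href(elements, default=""):
    """Return the href of the first element, or a default if the list is empty."""
    return elements[0].get_attribute("href") if elements else default


def build_row(names, prices, links):
    """Turn three find_elements() results into one row of plain strings."""
    return [first_text(names), first_text(prices, "판매중단"), first_href(links)]


class StubElement:
    """Hypothetical stand-in that mimics the two WebElement members used above."""

    def __init__(self, text="", href=None):
        self.text = text
        self._href = href

    def get_attribute(self, name):
        return self._href if name == "href" else None


# Simulated results for one product with no price element on the page.
names = [StubElement(text="iPhone 13")]
prices = []
links = [StubElement(href="https://example.com/item/1")]

row = build_row(names, prices, links)

buffer = io.StringIO()
csv.writer(buffer).writerow(row)
print(buffer.getvalue())
```

In the original loop this would become csvWriter.writerow(build_row(names, prices, links)), with names, prices and links being the lists returned by item.find_elements(...). Taking only the first match is an assumption on my part; if an item can have several names or prices, you would need to decide how to join them.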