While Python is writing to CSV, a try/except block in the script inserts an unwanted new line into the CSV file
Good day,
I am new to Python and Selenium and need help with the following problem.
My code snippet is as follows:
    num_page_items = len(date)
    blank = "0"
    try:
        with open('results.csv', 'a') as f:
            for i in range(num_page_items):
                f.write(name[i].text + "#" + surname[i].text + "#" + ref[i].text + "#" + url[i].text + "\n")
    except IndexError:
        with open('results.csv', 'a') as f:
            f.write(blank)
I have a few variables that I am scraping from a website using Selenium.
A data sample and the expected output are as follows:
Name: Joe Surname: Soap Ref: 1234 URL: www.example.com
Name: Bill Surname: Smith Ref: 4567 URL: www.dot.com
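[image: expected output]
The Python script runs fine when all elements exist, but when one element is missing (in the example, Ref is absent from the second entry), the output is as follows:
[image: output when an element doesn't exist]
What can I do to set a variable to "Null" when it does not exist on the web page, so that the new expected output is as follows:
[image: expected output when element doesn't exist]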
Just as a side note, the error I receive isn't a Selenium exception but an IndexError, hence the use of the "IndexError" except statement.
Edit - Felipe Gutierrez's suggestion
A larger code snippet, as Felipe suggested:
    for url in links:
        driver.get(url) #goes to the array and opens each link
        company = driver.find_elements_by_xpath("""//*[contains(@id, 'node')]/div[2]/ul/li/div/div[1]/span""")
        date = driver.find_elements_by_xpath("""//*[contains(@id, 'node')]/div[1]/div[1]/div[2]/div/span""")
        ref = driver.find_elements_by_xpath("""//*[contains(@id, 'node')]/div[1]/div[3]""")
        title = driver.find_elements_by_xpath("""//*[@id="page-title"]/span""")
        urlinf = driver.current_url
        num_page_items = len(date)
        blank = "blank"
        for ref in ref:
            if ref is None:
                ref = 0
        with open('results.csv', 'a') as f:
            for i in range(num_page_items):
                f.write(company[i].text + "#" + date[i].text + "#" + ref[i].text + "#" + title[i].text + "#" + urlinf + "\n")
    driver.close()
I am now getting the following error:
    Traceback (most recent call last):
      File "accc_for_loop_nest.py", line 50, in <module>
        f.write(company[i].text + "#" + date[i].text + "#" + ref[i].text + "#" + title[i].text + "#" + urlinf + "\n")
    TypeError: 'WebElement' object does not support indexing
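The TypeError most likely comes from the `for ref in ref:` loop: it reuses the name `ref` as its loop variable, so once the loop finishes, `ref` refers to the last single element rather than to the list, and `ref[i]` can no longer be indexed. Here is a minimal sketch of that effect without a browser. `FakeElement` is a hypothetical stand-in for a Selenium WebElement (it only has a `.text` attribute), not a real Selenium class.

```python
class FakeElement:
    """Hypothetical stand-in for a WebElement: has .text, supports no indexing."""

    def __init__(self, text):
        self.text = text


# find_elements_* returns a list of elements; simulate that here.
ref = [FakeElement("1234"), FakeElement("4567")]

for ref in ref:      # rebinds the name `ref` to each element in turn
    if ref is None:  # never true: the list holds elements, not None
        ref = 0

# After the loop, `ref` is the last element, not the original list.
try:
    ref[0].text
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(type(ref).__name__)  # FakeElement
print(error_message)       # exact wording depends on the Python version
```

Using a different loop variable name (for example `for r in ref:`) avoids the rebinding, although the `is None` check would still never fire, because a missing element is simply absent from the list rather than present as `None`.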
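You are losing the index of the lists you are iterating over by using try/except. You could try testing for the values that would cause the IndexError before the insert loop and assigning a zero to the list at that specific position. Then insert without exception handling.
Something like: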
    for url in links:
        driver.get(url) #goes to the array and opens each link
        company = driver.find_elements_by_xpath("""//*[contains(@id, 'node')]/div[2]/ul/li/div/div[1]/span""")
        date = driver.find_elements_by_xpath("""//*[contains(@id, 'node')]/div[1]/div[1]/div[2]/div/span""")
        ref = driver.find_elements_by_xpath("""//*[contains(@id, 'node')]/div[1]/div[3]""")
        title = driver.find_elements_by_xpath("""//*[@id="page-title"]/span""")
        urlinf = driver.current_url
        num_page_items = len(date)
        blank = "blank"
        # plain-string copies of each column, so empty values can be replaced
        companyStrings = []
        dateStrings = []
        refStrings = []
        titleStrings = []
        with open('results.csv', 'a') as f:
            for i in range(num_page_items):
                companyStrings.append( company[i].text )
                dateStrings.append( date[i].text )
                refStrings.append( ref[i].text )
                titleStrings.append( title[i].text )
                # substitute '0' for any empty field before writing the row
                if companyStrings[i] == '':
                    companyStrings[i] = '0'
                if dateStrings[i] == '':
                    dateStrings[i] = '0'
                if refStrings[i] == '':
                    refStrings[i] = '0'
                if titleStrings[i] == '':
                    titleStrings[i] = '0'
                f.write(companyStrings[i] + "#" + dateStrings[i] + "#" + refStrings[i] + "#" + titleStrings[i] + "#" + urlinf + "\n")
    driver.close()
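One thing the snippet above does not cover is a list that is shorter than `num_page_items`: in that case `ref[i]` would still raise IndexError before the empty-string check runs. Below is a rough sketch of one way to guard the lookup itself and write "Null" for a missing value. The helper names `text_at` and `write_rows`, and the `FakeElement` stand-in used for the demo, are my own assumptions for illustration and are not part of Selenium. It writes to an in-memory buffer so it can run without a browser; in the real script you would pass the file opened with `open('results.csv', 'a', newline='')` instead.

```python
import csv
import io


def text_at(elements, i, default="Null"):
    """Return elements[i].text, or `default` if the index is missing or the text is empty."""
    if i < len(elements):
        text = elements[i].text.strip()
        if text:
            return text
    return default


def write_rows(out, columns, extra=(), delimiter="#"):
    """Write one row per index up to the longest column, filling gaps with "Null"."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    n_rows = max((len(col) for col in columns), default=0)
    for i in range(n_rows):
        writer.writerow([text_at(col, i) for col in columns] + list(extra))


class FakeElement:
    """Hypothetical stand-in for a WebElement, used only for this demo."""

    def __init__(self, text):
        self.text = text


name = [FakeElement("Joe"), FakeElement("Bill")]
surname = [FakeElement("Soap"), FakeElement("Smith")]
ref = [FakeElement("1234")]  # the second entry has no Ref on the page

buf = io.StringIO()
write_rows(buf, [name, surname, ref], extra=["www.example.com"])
print(buf.getvalue())
# Joe#Soap#1234#www.example.com
# Bill#Smith#Null#www.example.com
```

A caveat: padding by position only gives the right answer when the missing value belongs to the last entry. If the first entry lacks a Ref, the flat list shifts and "1234" ends up on the wrong row. Locating each entry's container element first and then searching for the fields inside that container is probably the more reliable approach, since a missing field then shows up as an empty result for that one entry.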