For Loop to CSV Leading to Uneven Rows in Python
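I'm still learning Python, so apologies if this is a really obvious mistake. I've been trying to figure it out for hours, though, and wanted to see if anyone could help.
I've scraped a hockey website for the names and prices of their ice skates and written them to a CSV file. The only problem is that, when I write to the CSV, the rows of the name column (listed as Gear) and the price column are not aligned. It goes:
- Gear name 1
- Row space
- Price
- Row space
- Gear name 2
It would be preferable to have the gear and price rows aligned next to each other. I've attached a link to a picture of the CSV in case that helps.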
import requests
from bs4 import BeautifulSoup as Soup
webpage_response = requests.get('https://www.purehockey.com/c/ice-hockey-skates-senior?')
webpage = (webpage_response.content)
parser = Soup(webpage, 'html.parser')
filename = "gear.csv"
f = open(filename, "w")
headers = "Gear, Price"
f.write(headers)
for gear in parser.find_all("div", {"class": "details"}):
    gearname = gear.find_all("div", {"class": "name"}, "a")
    gearnametext = gearname[0].text
    gearprice = gear.find_all("div", {"class": "price"}, "a")
    gearpricetext = gearprice[0].text
    print(gearnametext)
    print(gearpricetext)
    f.write(gearnametext + "," + gearpricetext)
[What the uneven rows look like][1]
[1]: https://i.stack.imgur.com/EG2f2.png
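I noticed that the string in gearnametext contains two \n characters. You should try the str.replace() method to remove the \n, which is what makes the output jump to the next line. Try: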
import requests
from bs4 import BeautifulSoup as Soup
webpage_response = requests.get('https://www.purehockey.com/c/ice-hockey-skates-senior?')
webpage = (webpage_response.content)
parser = Soup(webpage, 'html.parser')
filename = "gear.csv"
f = open(filename, "w")
headers = "Gear, Price"
f.write(headers)
for gear in parser.find_all("div", {"class": "details"}):
    gearname = gear.find_all("div", {"class": "name"}, "a")
    gearnametext = gearname[0].text.replace('\n','')
    gearprice = gear.find_all("div", {"class": "price"}, "a")
    gearpricetext = gearprice[0].text
    print(gearnametext)
    print(gearpricetext)
    f.write(gearnametext + "," + gearpricetext)
I changed the second line inside the loop, the one for the gear name: gearnametext = gearname[0].text.replace('\n','').
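To see what the stray newline characters do to a row, here is a small offline sketch. The sample strings are made up to mimic what the scraped text might look like (the real page's text may differ), and the helper names clean_replace and clean_strip are only illustrative. It compares str.replace('\n', '') with str.strip():

```python
# Hypothetical sample values; the real scraped text may differ.
raw_name = "\nBauer Vapor X3.7 Ice Hockey Skates - Senior\n"
raw_price = "\n  $9.99 \n"


def clean_replace(text):
    """Remove every newline character, but leave other whitespace alone."""
    return text.replace("\n", "")


def clean_strip(text):
    """Remove leading and trailing whitespace, including newlines."""
    return text.strip()


# Row built from the raw text: the embedded newlines split it over several lines.
row_raw = raw_name + "," + raw_price
# Row built after replace(): newlines are gone, surrounding spaces remain.
row_replace = clean_replace(raw_name) + "," + clean_replace(raw_price)
# Row built after strip(): leading and trailing whitespace is gone too.
row_strip = clean_strip(raw_name) + "," + clean_strip(raw_price)

print(repr(row_raw))
print(repr(row_replace))
print(repr(row_strip))
```

Note that both fields need cleaning, not just the name: any newline left in the price text would still break the row.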
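With Python 3, I would recommend using with open(filename, 'w') as f: and calling strip() on your texts before you write() them to your file.
Note that write() does not add a line break for you, whether the file is opened in 'w' or 'a' mode, so you have to append one to each line you write.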
Example
import requests
from bs4 import BeautifulSoup as Soup
webpage_response = requests.get('https://www.purehockey.com/c/ice-hockey-skates-senior?')
webpage = (webpage_response.content)
parser = Soup(webpage, 'html.parser')
filename = "gear1.csv"
headers = "Gear,Price\n"
with open(filename, 'w') as f:
    f.write(headers)
    for gear in parser.find_all("div", {"class": "details"}):
        gearnametext = gear.find("div", {"class": "name"}).text.strip()
        gearpricetext = gear.find("div", {"class": "price"}).text.strip()
        f.write(gearnametext + "," + gearpricetext+"\n")
Output
Gear,Price
Bauer Vapor X3.7 Ice Hockey Skates - Senior,9.99
Bauer X-LP Ice Hockey Skates - Senior,9.99
Bauer Vapor Hyperlite Ice Hockey Skates - Senior,9.98 - 49.98
CCM Jetspeed FT475 Ice Hockey Skates - Senior,9.99
Bauer X-LP Ice Hockey Skates - Intermediate,9.99
...
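As a possible alternative to joining the fields by hand, the standard library's csv module can write the rows and takes care of line endings and of quoting fields that contain a comma. This is only a sketch: write_gear_csv and sample_rows are hypothetical names, and the sample data stands in for the scraped values so the snippet runs without network access.

```python
import csv
import io


def write_gear_csv(rows, fileobj):
    """Write (name, price) pairs as CSV rows, preceded by a header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["Gear", "Price"])
    for name, price in rows:
        # strip() drops the stray newlines and spaces around each field
        writer.writerow([name.strip(), price.strip()])


# Hypothetical sample data standing in for the scraped values.
sample_rows = [
    ("\nBauer X-LP Ice Hockey Skates - Senior\n", "\n9.99\n"),
    ("Skates, Senior (name with a comma)", " 9.98 - 49.98 "),
]

# An in-memory buffer is used here instead of a real file.
buffer = io.StringIO()
write_gear_csv(sample_rows, buffer)
print(buffer.getvalue())
```

When writing to a real file, the csv documentation recommends opening it with newline="", for example with open("gear1.csv", "w", newline="") as f:, and passing f to the function in place of the buffer.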