Using Python's parse with criteria
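First off, I have very little experience with any kind of coding, so I don't fully know what I'm doing, but I'll do my best!
I've been working on this code, which fetches the HTML of a website and then gives me a .CSV file of the named elements (?) (the ones you can see in the site's inspect panel).
So my question is: how can I use a condition in my current code so that it only returns words that, for example, contain the letter g?
I'm happy to elaborate!
Thanks in advance!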
import urllib.request
from bs4 import BeautifulSoup
import csv

url = 'https://kouluruoka.fi/menu/kouvola_koulujenruokalista'
request = urllib.request.Request(url)
content = urllib.request.urlopen(request)
parse = BeautifulSoup(content, 'html.parser')

# Collect every <h2> and <span> element on the page
text1 = parse.find_all('h2')
text2 = parse.find_all('span')

# Write the text of the elements above to the .CSV file
# (newline='' is what the csv module expects for files it writes to)
with open('index.csv', 'a', newline='') as csv_file:
    writer = csv.writer(csv_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    for col1, col2 in zip(text1, text2):
        writer.writerow([col1.get_text().strip(), col2.get_text().strip()])
You can check whether an element contains a given string or letter like this:
h2_elements = parse.find_all('h2')
span_elements = parse.find_all('span')

# Write the text of the elements above to the .CSV file
with open('index.csv', 'a', newline='') as csv_file:
    writer = csv.writer(csv_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    for h2_element, span_element in zip(h2_elements, span_elements):
        h2_element_str = h2_element.get_text().strip()
        span_element_str = span_element.get_text().strip()
        # Keep the row only if both texts contain the letter 'a'
        if 'a' in h2_element_str and 'a' in span_element_str:
            writer.writerow([h2_element_str, span_element_str])
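If you want to change the letter, or accept a row when only one of the two texts matches, the same check can be pulled out into a small helper. This is only a sketch: `filter_rows`, its `require_both` flag, and the sample data are made up for illustration and stand in for the stripped `<h2>`/`<span>` texts from the code above. It also compares in lower case, which is an assumption about what you want ("G" and "g" both count).

```python
import csv
import io


def filter_rows(pairs, letter, require_both=True):
    """Return the (heading, text) pairs that contain `letter`, ignoring case."""
    needle = letter.lower()
    kept = []
    for heading, text in pairs:
        in_heading = needle in heading.lower()
        in_text = needle in text.lower()
        # Either demand the letter in both columns, or in at least one of them
        matches = (in_heading and in_text) if require_both else (in_heading or in_text)
        if matches:
            kept.append((heading, text))
    return kept


# Made-up sample rows standing in for the scraped, stripped element texts
sample = [
    ("Maanantai", "Jauhelihakeitto"),
    ("Tiistai", "Grillimakkara"),
    ("Gourmet", "Kasvisgratiini"),
]

# Write to an in-memory buffer here; in the real script this would be csv_file
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
writer.writerows(filter_rows(sample, "g"))
print(buffer.getvalue())
```

In your script you would build `pairs` from `zip(h2_elements, span_elements)` after calling `get_text().strip()` on each element, and pass `require_both=False` if a match in either column is enough.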