Double-click an element in BeautifulSoup in Python
I can't get the text with an XPath in BeautifulSoup, but Selenium can get that text with a double-click command. How can I do this with BeautifulSoup?
To get the element, I tried:
import requests
from lxml import etree
from bs4 import BeautifulSoup

# Function to find the element's text from the XPath
def Xpath(url):
    Dict_Headers = {
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
        'Accept-Language': 'en-US, en;q=0.5',
    }
    # Fetch the page; headers must be passed by keyword, because the second
    # positional argument of requests.get is params (the query string)
    webPage = requests.get(url, headers=Dict_Headers)
    # Create a soup object from the HTML content
    Scraping = BeautifulSoup(webPage.content, "html.parser")
    # Convert the soup object to an etree object for XPath processing
    documentObjectModel = etree.HTML(str(Scraping))
    return documentObjectModel.xpath("//div[@id='pages']/div/section/div[5]/div/div[3]/div[2]/div[2]")[0].text

URL = "http://..."
print(Xpath(URL))
The double-click command in Selenium is:
selenium.doubleClick("xpath=//div[@id='pages']/div/section/div[5]/div/div[3]/div[2]/div[2]")
What I want to get is the "0" shown in the inspector screenshot.
You need neither Selenium nor BeautifulSoup. There is an API that the page gets its data from.
Here is how to get the part your XPath points to:
import requests
url = "https://servis.mgm.gov.tr/web/sondurumlar?merkezid=90601"
headers = {
"Accept": "application/json, text/plain, */*",
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:100.0) Gecko/20100101 Firefox/100.0",
"Referer": "https://mgm.gov.tr/",
"Origin": "https://mgm.gov.tr",
}
data = requests.get(url, headers=headers).json()[0]
for k, v in data.items():
    print(f"{k}: {v}")
Output:
aktuelBasinc: 911.6
denizSicaklik: -9999
denizeIndirgenmisBasinc: 1009
gorus: 20000
hadiseKodu: AB
istNo: 17130
kapalilik: 2
karYukseklik: -9999
nem: 25
rasatMetar: -9999
rasatSinoptik: -9999
rasatTaf: -9999
ruzgarHiz: 13.32
ruzgarYon: 36
sicaklik: 29.7
veriZamani: 2022-06-01T09:10:00.000Z
yagis00Now: 0
yagis10Dk: 0
yagis12Saat: 0
yagis1Saat: 0
yagis24Saat: 0
yagis6Saat: 0
denizVeriZamani: 2022-06-01T09:00:00.000Z
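Several fields in that output are -9999, which looks like a placeholder for "no reading". That is my assumption, not something the service documents here. Under that assumption, a minimal sketch for dropping those fields and reading one value could look like this. The helper name drop_missing and the sample dict are hypothetical, and I can't tell from the post which field the "0" in the screenshot corresponds to, so yagis24Saat is used purely for illustration.

```python
# Assumption: the service uses -9999 to mean "no reading available".
MISSING = -9999


def drop_missing(record, sentinel=MISSING):
    """Return a copy of record without the fields equal to the sentinel."""
    return {key: value for key, value in record.items() if value != sentinel}


# A few fields copied from the output above, so this runs without a request.
sample = {
    "sicaklik": 29.7,
    "denizSicaklik": -9999,
    "nem": 25,
    "karYukseklik": -9999,
    "yagis24Saat": 0,
}

cleaned = drop_missing(sample)
print(cleaned)

# Illustrative only: read a single field once the placeholders are gone.
print(cleaned.get("yagis24Saat"))
```

With the real response you would pass the data dict from the request above instead of sample.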