How to scrape GIS coordinates from a map (e.g. Pokevision) using Python and Selenium?
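I want to scrape Pokevision so that I can get the latitude and longitude coordinates of every Pokemon currently displayed.
The page URL contains the latitude and longitude of the flag marker. For example, the URL below contains 39.95142302373031,-75.17986178398132: https://pokevision.com/#/@39.95142302373031,-75.17986178398132
In the page source, the flag marker sits in the following div: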
<div class="leaflet-marker-pane"><img class="leaflet-marker-icon leaflet-zoom-animated leaflet-clickable" src="/asset/image/leaflet//marker-icon.png" style="margin-left: -12px; margin-top: -41px; width: 25px; height: 41px; transform: translate(324px, 135px); z-index: 135;" tabindex="0"/>
I also noticed that every Pokemon displayed has a div like this:
<div class="leaflet-marker-icon-wrapper leaflet-zoom-animated leaflet-clickable" style="margin-left: 0px; margin-top: 0px; transform: translate(215px, 113px); z-index: 113;" tabindex="0"><img class="leaflet-marker-icon " src="//ugc.pokevision.com/images/pokemon/116.png" style="margin-left: 0px; margin-top: 0px; width: 48px; height: 48px;"/><span class="leaflet-marker-iconlabel home-map-label" style="margin-left: 10px; margin-top: 26px;">08:15</span></div>
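I assume the positions of the Pokemon and of the flag marker can be found in these elements, specifically after the text "transform: translate(".
Given that we know the flag marker's pixel position and its latitude and longitude, as well as the Pokemon's pixel position, I believe I should be able to work out the Pokemon's latitude and longitude.
For example, the flag marker is always at 324px, 135px, and we know its GIS coordinates are 39.95142302373031,-75.17986178398132. We also know the Pokemon's pixel coordinates (e.g. 215px, 113px). However, I can't seem to figure out how to get the Pokemon's latitude and longitude.
If you click on the map, the URL updates with the coordinates of that point. You can find all the visible Pokemon on the map, click each one, and parse the coordinates from the updated URL. Sample code: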
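For what it's worth, the pixel-offset idea above can be sketched in code, provided the map's zoom level is known. Leaflet projects with Web Mercator (EPSG:3857) by default, so a pixel offset between two markers in the same pane translates into an offset in "world pixel" space. The following is only a rough sketch under several assumptions that the question does not confirm: 256px tiles, the default projection, both translate() values being relative to the same map pane, and a zoom level of 17 picked purely as a placeholder. The helper names (parse_translate, marker_latlng, and so on) are made up for illustration.

```python
import math
import re

TILE_SIZE = 256  # assumption: Leaflet's default tile size in pixels


def parse_translate(style):
    """Pull (x, y) in pixels out of a 'transform: translate(Xpx, Ypx)' style string."""
    match = re.search(r"translate\(\s*(-?[\d.]+)px,\s*(-?[\d.]+)px\)", style)
    if match is None:
        raise ValueError("no translate() found in style")
    return float(match.group(1)), float(match.group(2))


def latlng_to_world_px(lat, lng, zoom):
    """Project latitude/longitude to Web Mercator pixel coordinates at a zoom level."""
    scale = TILE_SIZE * 2 ** zoom
    x = (lng + 180.0) / 360.0 * scale
    sin_lat = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * scale
    return x, y


def world_px_to_latlng(x, y, zoom):
    """Inverse of latlng_to_world_px."""
    scale = TILE_SIZE * 2 ** zoom
    lng = x / scale * 360.0 - 180.0
    n = math.pi * (1 - 2 * y / scale)
    lat = math.degrees(math.atan(math.sinh(n)))
    return lat, lng


def marker_latlng(ref_latlng, ref_px, target_px, zoom):
    """Estimate a marker's lat/lng from its pixel offset to a reference marker."""
    ref_x, ref_y = latlng_to_world_px(ref_latlng[0], ref_latlng[1], zoom)
    dx = target_px[0] - ref_px[0]
    dy = target_px[1] - ref_px[1]
    return world_px_to_latlng(ref_x + dx, ref_y + dy, zoom)


if __name__ == "__main__":
    flag_style = "width: 25px; height: 41px; transform: translate(324px, 135px); z-index: 135;"
    poke_style = "margin-left: 0px; transform: translate(215px, 113px); z-index: 113;"
    flag_latlng = (39.95142302373031, -75.17986178398132)
    assumed_zoom = 17  # placeholder: the real zoom level must be read from the page
    print(marker_latlng(flag_latlng, parse_translate(flag_style),
                        parse_translate(poke_style), assumed_zoom))
```

The result is only as good as the zoom value: a wrong zoom scales every offset by a power of two, so it would need to be read from the page rather than guessed.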
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

# Maps the number in a Pokemon's image file name to a display name
poke_names = {
    21: "Spearow",
    23: "Ekans",
    39: "Jigglypuff",
    98: "Krabby",
    129: "Pidgey",
}

driver = webdriver.Chrome()
try:
    driver.get("https://pokevision.com/#/@39.95142302373031,-75.17986178398132")

    # Zoom out once
    # (find_element(By.CSS_SELECTOR, ...) is used because recent Selenium
    # releases no longer provide find_element_by_css_selector)
    zoom_css = "a.leaflet-control-zoom-out"
    driver.find_element(By.CSS_SELECTOR, zoom_css).click()

    # Find all pokemon in the source
    poke_css = "div.leaflet-marker-pane div.leaflet-marker-icon-wrapper"
    pokemon = driver.find_elements(By.CSS_SELECTOR, poke_css)
    print("Found {0} pokemon".format(len(pokemon)))

    # Filter for only the ones that are displayed on screen
    on_screen_pokemon = [p for p in pokemon if p.is_displayed()]
    print("There are {0} pokemon on screen".format(len(on_screen_pokemon)))

    # Click each pokemon, which moves the marker and thus updates the URL with
    # the coords of that pokemon
    coords = []
    for poke in on_screen_pokemon:
        try:
            poke.click()
            # Example URL: https://ugc.pokevision.com/images/pokemon/21.png
            img_url = poke.find_element(By.CSS_SELECTOR, "img").get_attribute("src")
            img_num = int(img_url.split(".png")[0].split("/")[-1])
        except WebDriverException:
            # Some are hidden by other elements, move on
            continue
        else:
            # Example
            # https://pokevision.com/#/@39.95142302373031,-75.17986178398132
            poke_coords = driver.current_url.split("#/@")[1].split(",")
            poke_name = poke_names.get(img_num, "Unknown")
            coords.append((poke_name, poke_coords))

    print("Found coordinates for {0} pokemon".format(len(coords)))
    for poke_name, poke_coords in coords:
        print("Found {0} pokemon at coordinates {1}".format(poke_name, poke_coords))
finally:
    # Always close the browser, even if something above fails
    driver.quit()
Output:
(.venv35) ➜ stackoverflow python pokefinder.py
Found 103 pokemon
There are 85 pokemon on screen
Found coordinates for 27 pokemon
Found Unknown pokemon at coordinates ['39.95481970299595', '-75.18772602081299']
Found Spearow pokemon at coordinates ['39.952878764070974', '-75.18424987792967']
Found Spearow pokemon at coordinates ['39.95625069896077', '-75.18845558166504']
Found Unknown pokemon at coordinates ['39.95685927437669', '-75.18216848373413']
Found Unknown pokemon at coordinates ['39.95174378273782', '-75.17852067947388']
Found Unknown pokemon at coordinates ['39.9509706687274', '-75.17377853393555']
Found Unknown pokemon at coordinates ['39.95241819420643', '-75.17523765563965']
Found Unknown pokemon at coordinates ['39.95409596949794', '-75.17422914505005']
Found Unknown pokemon at coordinates ['39.95131610372689', '-75.17277002334595']
Found Unknown pokemon at coordinates ['39.95276362189558', '-75.17313480377197']
Found Unknown pokemon at coordinates ['39.95254978591276', '-75.17257690429688']
Found Unknown pokemon at coordinates ['39.95319129185564', '-75.17094612121582']
Found Unknown pokemon at coordinates ['39.95488549657055', '-75.17195463180542']
Found Unknown pokemon at coordinates ['39.95488549657055', '-75.17096757888794']
Found Unknown pokemon at coordinates ['39.9571224404468', '-75.17251253128052']
Found Unknown pokemon at coordinates ['39.95633293919831', '-75.17088174819946']
Found Spearow pokemon at coordinates ['39.94958891128449', '-75.1890778541565']
Found Pidgey pokemon at coordinates ['39.94958891128449', '-75.18671751022339']
Found Unknown pokemon at coordinates ['39.94769717428357', '-75.18306970596313']
Found Unknown pokemon at coordinates ['39.948174225938324', '-75.18070936203003']
Found Unknown pokemon at coordinates ['39.94458803200817', '-75.17658948898315']
Found Unknown pokemon at coordinates ['39.94689111392826', '-75.174400806427']
Found Unknown pokemon at coordinates ['39.948322275775425', '-75.1739501953125']
Found Ekans pokemon at coordinates ['39.94749977262573', '-75.17088174819946']
Found Unknown pokemon at coordinates ['39.94842097548884', '-75.17317771911621']
Found Unknown pokemon at coordinates ['39.94934216594682', '-75.17180442810059']
Found Unknown pokemon at coordinates ['39.948075525868894', '-75.17107486724852']
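This code is problematic for several reasons, chief among them the overly broad and careless exception handling. However, you should be able to apply the concept to a more robust solution.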
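As one small step toward that more robust version, the two string-splitting one-liners could be replaced with helpers that return None on unexpected input instead of raising an IndexError or ValueError somewhere inside the loop. This is just a sketch using the standard library; the function names coords_from_url and pokemon_id_from_img are hypothetical, and the URL shapes are assumed to match the examples shown in this post.

```python
import re
from urllib.parse import urlparse

# Fragment looks like "/@39.95142302373031,-75.17986178398132"
_COORD_RE = re.compile(r"^/@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")
# Image path looks like "/images/pokemon/21.png"
_IMG_RE = re.compile(r"/(\d+)\.png$")


def coords_from_url(url):
    """Return (lat, lng) as floats from the URL fragment, or None if absent."""
    match = _COORD_RE.match(urlparse(url).fragment)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def pokemon_id_from_img(src):
    """Return the numeric id from an image URL, or None if it doesn't match."""
    match = _IMG_RE.search(urlparse(src).path)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(coords_from_url("https://pokevision.com/#/@39.95142302373031,-75.17986178398132"))
    print(pokemon_id_from_img("https://ugc.pokevision.com/images/pokemon/21.png"))
```

With helpers like these, the try block only needs to wrap the click itself, so the except clause can be narrowed to the specific Selenium exceptions you actually expect from a covered or stale element.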