How to scrape GIS coordinates from a map (e.g. Pokevision) using Python and Selenium?

I want to scrape Pokevision so that I can get the latitude and longitude of every Pokemon it is currently displaying.

The page URL contains the latitude and longitude of the flag marker. For example, the URL below contains 39.95142302373031,-75.17986178398132: https://pokevision.com/#/@39.95142302373031,-75.17986178398132
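As a minimal sketch of reading those two numbers back out of the URL, the helper below splits on the "#/@" prefix shown in the example. The function name coords_from_url is made up for illustration, and the fragment format is assumed to be exactly "lat,lng" as in the URL above.

```python
def coords_from_url(url):
    """Return (lat, lng) as floats, or None if the URL has no "#/@lat,lng" part.

    Assumption: the fragment looks exactly like the example in the question.
    """
    _, separator, fragment = url.partition("#/@")
    if not separator:
        return None
    parts = fragment.split(",")
    if len(parts) != 2:
        return None
    try:
        return float(parts[0]), float(parts[1])
    except ValueError:
        # The fragment was present but did not hold two numbers
        return None


flag = coords_from_url(
    "https://pokevision.com/#/@39.95142302373031,-75.17986178398132"
)
print(flag)
```

Returning None instead of raising keeps the caller's error handling simple when the page has not yet written coordinates into the URL.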

In the page source, the flag marker has the following div:

<div class="leaflet-marker-pane"><img class="leaflet-marker-icon leaflet-zoom-animated leaflet-clickable" src="/asset/image/leaflet//marker-icon.png" style="margin-left: -12px; margin-top: -41px; width: 25px; height: 41px; transform: translate(324px, 135px); z-index: 135;" tabindex="0"/>

I also noticed that every Pokemon displayed has a div like this:

<div class="leaflet-marker-icon-wrapper leaflet-zoom-animated leaflet-clickable" style="margin-left: 0px; margin-top: 0px; transform: translate(215px, 113px); z-index: 113;" tabindex="0"><img class="leaflet-marker-icon " src="//ugc.pokevision.com/images/pokemon/116.png" style="margin-left: 0px; margin-top: 0px; width: 48px; height: 48px;"/><span class="leaflet-marker-iconlabel home-map-label" style="margin-left: 10px; margin-top: 26px;">08:15</span></div>

I assume the positions of the Pokemon and of the flag marker can be found in these divs, specifically after the text "transform: translate(".
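A small sketch of pulling those pixel offsets out of a style attribute with a regular expression follows. The name parse_translate is hypothetical, and the pattern also accepts translate3d(...) on the assumption that some browsers render the markers that way; only the translate(...) form appears in the snippets above.

```python
import re

# Matches "translate(215px, 113px)" and, as an assumption, "translate3d(215px, 113px, 0px)"
TRANSLATE_RE = re.compile(
    r"translate(?:3d)?\(\s*(-?\d+(?:\.\d+)?)px\s*,\s*(-?\d+(?:\.\d+)?)px"
)


def parse_translate(style):
    """Return the (x, y) pixel offset from a style string, or None if absent."""
    match = TRANSLATE_RE.search(style)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


pokemon_style = (
    "margin-left: 0px; margin-top: 0px; "
    "transform: translate(215px, 113px); z-index: 113;"
)
print(parse_translate(pokemon_style))
```

With Selenium, the style string would come from element.get_attribute("style") on each marker element.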

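Since we know the flag's pixel position and its latitude and longitude, as well as each Pokemon's pixel position, I believe I should be able to work out each Pokemon's latitude and longitude.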

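For example, the flag is always at 324px, 135px, and we know its GIS coordinates are 39.95142302373031,-75.17986178398132. We also know the Pokemon's pixel coordinates (e.g. 215px, 113px). However, I can't figure out how to get the Pokemon's latitude and longitude from that.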
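For what it's worth, the pixel-offset idea can be sketched with the standard Web Mercator formulas, under several assumptions that the page itself does not confirm: that the map uses Leaflet's default projection with 256px tiles, that the translate values mark the geographic point of each marker rather than an icon corner, and that the current zoom level is known. The zoom value of 15 below is a placeholder, and all function names are made up for illustration.

```python
import math

TILE_SIZE = 256  # assumption: default Leaflet tile size


def latlng_to_pixel(lat, lng, zoom):
    """Project latitude/longitude to world pixel coordinates (Web Mercator)."""
    size = TILE_SIZE * 2 ** zoom
    x = (lng + 180.0) / 360.0 * size
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * size
    return x, y


def pixel_to_latlng(x, y, zoom):
    """Inverse of latlng_to_pixel."""
    size = TILE_SIZE * 2 ** zoom
    lng = x / size * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / size))))
    return lat, lng


def offset_to_latlng(ref_latlng, ref_px, target_px, zoom):
    """Estimate the lat/lng of target_px from a reference marker at ref_px."""
    ref_x, ref_y = latlng_to_pixel(ref_latlng[0], ref_latlng[1], zoom)
    dx = target_px[0] - ref_px[0]
    dy = target_px[1] - ref_px[1]
    return pixel_to_latlng(ref_x + dx, ref_y + dy, zoom)


flag_latlng = (39.95142302373031, -75.17986178398132)
ZOOM = 15  # placeholder: the real zoom level would have to be read from the map
print(offset_to_latlng(flag_latlng, (324, 135), (215, 113), ZOOM))
```

Because screen y grows downward, a Pokemon drawn above and to the left of the flag comes out with a larger latitude and a smaller longitude. A wrong zoom level scales every offset, so the result is only as good as that input.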

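If you click on the map, the URL updates with the coordinates of that point. You can find all the visible Pokemon on the map, click each one, and parse the coordinates from the updated URL. Sample code: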

from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

# Maps the number in each marker's image URL to a Pokemon name
poke_names = {
    21: "Spearow",
    23: "Ekans",
    39: "Jigglypuff",
    98: "Krabby",
    129: "Pidgey",
}

driver = webdriver.Chrome()
try:
    driver.get("https://pokevision.com/#/@39.95142302373031,-75.17986178398132")

    # Zoom out once
    zoom_css = "a.leaflet-control-zoom-out"
    driver.find_element(By.CSS_SELECTOR, zoom_css).click()

    # Find all pokemon in the source
    poke_css = "div.leaflet-marker-pane div.leaflet-marker-icon-wrapper"
    all_pokemon = driver.find_elements(By.CSS_SELECTOR, poke_css)
    print("Found {0} pokemon".format(len(all_pokemon)))

    # Filter for only the ones that are displayed on screen
    on_screen_pokemon = [p for p in all_pokemon if p.is_displayed()]
    print("There are {0} pokemon on screen".format(len(on_screen_pokemon)))

    # Click each pokemon, which moves the marker and thus updates the URL with
    # the coords of that pokemon
    coords = []
    for pokemon in on_screen_pokemon:
        try:
            pokemon.click()
            # Example URL: https://ugc.pokevision.com/images/pokemon/21.png
            img = pokemon.find_element(By.CSS_SELECTOR, "img")
            img_url = img.get_attribute("src")
            img_num = int(img_url.split('.png')[0].split('/')[-1])
        except WebDriverException:
            # Some are hidden by other elements, move on
            continue
        else:
            # Example
            # https://pokevision.com/#/@39.95142302373031,-75.17986178398132
            poke_coords = driver.current_url.split('#/@')[1].split(',')
            poke_name = poke_names.get(img_num, "Unknown")
            coords.append((poke_name, poke_coords))

    print("Found coordinates for {0} pokemon".format(len(coords)))
    for poke_name, poke_coords in coords:
        print("Found {0} pokemon at coordinates {1}".format(poke_name, poke_coords))

finally:
    # Always close the browser, even if scraping failed part-way
    driver.quit()

Output:

(.venv35) ➜  StackOverflow python pokefinder.py
Found 103 pokemon
There are 85 pokemon on screen
Found coordinates for 27 pokemon
Found Unknown pokemon at coordinates ['39.95481970299595', '-75.18772602081299']
Found Spearow pokemon at coordinates ['39.952878764070974', '-75.18424987792967']
Found Spearow pokemon at coordinates ['39.95625069896077', '-75.18845558166504']
Found Unknown pokemon at coordinates ['39.95685927437669', '-75.18216848373413']
Found Unknown pokemon at coordinates ['39.95174378273782', '-75.17852067947388']
Found Unknown pokemon at coordinates ['39.9509706687274', '-75.17377853393555']
Found Unknown pokemon at coordinates ['39.95241819420643', '-75.17523765563965']
Found Unknown pokemon at coordinates ['39.95409596949794', '-75.17422914505005']
Found Unknown pokemon at coordinates ['39.95131610372689', '-75.17277002334595']
Found Unknown pokemon at coordinates ['39.95276362189558', '-75.17313480377197']
Found Unknown pokemon at coordinates ['39.95254978591276', '-75.17257690429688']
Found Unknown pokemon at coordinates ['39.95319129185564', '-75.17094612121582']
Found Unknown pokemon at coordinates ['39.95488549657055', '-75.17195463180542']
Found Unknown pokemon at coordinates ['39.95488549657055', '-75.17096757888794']
Found Unknown pokemon at coordinates ['39.9571224404468', '-75.17251253128052']
Found Unknown pokemon at coordinates ['39.95633293919831', '-75.17088174819946']
Found Spearow pokemon at coordinates ['39.94958891128449', '-75.1890778541565']
Found Pidgey pokemon at coordinates ['39.94958891128449', '-75.18671751022339']
Found Unknown pokemon at coordinates ['39.94769717428357', '-75.18306970596313']
Found Unknown pokemon at coordinates ['39.948174225938324', '-75.18070936203003']
Found Unknown pokemon at coordinates ['39.94458803200817', '-75.17658948898315']
Found Unknown pokemon at coordinates ['39.94689111392826', '-75.174400806427']
Found Unknown pokemon at coordinates ['39.948322275775425', '-75.1739501953125']
Found Ekans pokemon at coordinates ['39.94749977262573', '-75.17088174819946']
Found Unknown pokemon at coordinates ['39.94842097548884', '-75.17317771911621']
Found Unknown pokemon at coordinates ['39.94934216594682', '-75.17180442810059']
Found Unknown pokemon at coordinates ['39.948075525868894', '-75.17107486724852']

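This code is problematic for several reasons, chief among them the overly broad and careless exception handling. However, you should be able to apply the concept to a more robust solution.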
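One possible step toward that, sketched here rather than prescribed: move the string parsing into a small pure function that returns None on unexpected input, so that the try block around the Selenium calls only has to cover the click itself. The helper name pokemon_id_from_src is hypothetical, and the URL shape is assumed from the image sources shown in the question.

```python
import re
from urllib.parse import urlparse

# Assumption: Pokemon icons end in "/<number>.png", as in the question's markup
ID_RE = re.compile(r"/(\d+)\.png$")


def pokemon_id_from_src(src):
    """Return the numeric id from an icon URL, or None if it does not match."""
    if not src:
        return None
    match = ID_RE.search(urlparse(src).path)
    return int(match.group(1)) if match else None


print(pokemon_id_from_src("//ugc.pokevision.com/images/pokemon/116.png"))
print(pokemon_id_from_src("/asset/image/leaflet//marker-icon.png"))
```

With the parsing isolated like this, the except clause around pokemon.click() could be narrowed to the specific exceptions Selenium raises for covered or non-interactable elements, such as ElementClickInterceptedException and ElementNotInteractableException from selenium.common.exceptions, instead of the catch-all WebDriverException.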