Retrieving full address and geocoding based on place/store name and city stored in a csv

I have a csv file with 2 fields, store_name and city. A city can have multiple stores.
I want an output csv with 5 fields: store_name, city, address, latitude, longitude.

For example, if one entry in the csv is Starbucks, Chicago, I want the output csv to contain all the information for the 5 fields mentioned above, like:
Starbucks, Chicago, "200 S Michigan Ave, Chicago, IL 60604, USA", 41.8164613, -87.8127855
Starbucks, Chicago, "8 N Michigan Ave, Chicago, IL 60602, USA", 41.8164613, -87.8127855
and so on for the remaining results.

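Before turning to the Google Maps API, I tried to solve this with Nominatim through GeoPy, though I do not know what the best approach is. Note that the source csv has a million such entries, but buying an API key is not a problem once this works.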

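So far I have only tried geocoding with Nominatim through pandas, but that produces just one result per entry in the output csv. I want every result, as in the example above, and I am not sure how to implement that.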

from geopy.geocoders import Nominatim
import sys
import pandas as pd

in_file = str(sys.argv[1])
out_file = str('gc_' + in_file)
timeout = int(sys.argv[2])

# user_agent is a string identifying your application to Nominatim (placeholder value)
nominatim = Nominatim(user_agent='your_app_name', timeout=timeout)

def gc(address):
    name = str(address['store_name'])
    city = str(address['city'])
    add_concat = name + ", " + city
    location = nominatim.geocode(add_concat)
    if location is not None:
        print(f'geocoded record {address.name}: {city}')
        located = pd.Series({
            'lat': location.latitude,
            'lng': location.longitude,
        })
    else:
        print(f'failed to geolocate record {address.name}: {city}')
        located = pd.Series({
            'lat': 'null',
            'lng': 'null',
        })
    return located

print('opening input.')
reader = pd.read_csv(in_file, header=0)
print('geocoding addresses.')
reader = reader.merge(reader.apply(lambda add: gc(add), axis=1), left_index=True, right_index=True)
print(f'writing to {out_file}.')
reader.to_csv(out_file, encoding='utf-8', index=False)
print('done.')
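
A minimal sketch of one way to get several rows per entry with the same Nominatim setup: geopy's `geocode` accepts `exactly_one=False`, which returns a list of matches instead of a single one. The helper names `expand_matches` and `geocode_csv` are hypothetical, and the `geocode` argument is assumed to be `nominatim.geocode` from the script above. How many matches come back for a query like "Starbucks, Chicago" depends on the service, so this is an illustration rather than a tested solution for a million entries.

```python
import csv


def expand_matches(store_name, city, geocode, limit=10):
    """Return one output row per match for a single store_name/city pair.

    `geocode` is any callable with geopy's geocode signature, for example
    nominatim.geocode (an assumption of this sketch).
    """
    query = f"{store_name}, {city}"
    # exactly_one=False asks geopy for a list of matches; None means no match
    matches = geocode(query, exactly_one=False, limit=limit) or []
    return [
        [store_name, city, m.address, m.latitude, m.longitude]
        for m in matches
    ]


def geocode_csv(in_path, out_path, geocode):
    """Read store_name/city pairs and write one output row per match."""
    with open(in_path, newline='', encoding='utf-8') as src, \
         open(out_path, 'w', newline='', encoding='utf-8') as dst:
        writer = csv.writer(dst)
        writer.writerow(['store_name', 'city', 'address', 'latitude', 'longitude'])
        for row in csv.DictReader(src):
            writer.writerows(
                expand_matches(row['store_name'], row['city'], geocode)
            )
```

It would be called as `geocode_csv(in_file, out_file, nominatim.geocode)`. Entries with no match simply produce no rows here; writing a row with empty address fields instead would be an equally reasonable choice.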

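You can use the Google Geocoding API for this. According to the official documentation, reverse geocoding is the process of converting geographic coordinates into a human-readable address. Note that although the function below is named reverse_gcode, it sends a text query in the address parameter, which is forward geocoding: a place description goes in, coordinates and a formatted address come out.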

I used the following function in one of my projects and it still works. You can modify it to suit your requirements.

import requests

GCODE_URL = 'https://maps.googleapis.com/maps/api/geocode/json?'
GCODE_KEY = 'YOUR API KEY' 

def reverse_gcode(location):
    # Despite its name, this sends a free-text query in the `address`
    # parameter. Passing `params` lets requests URL-encode the query and key.
    params = {'address': str(location), 'key': GCODE_KEY}
    data = requests.get(GCODE_URL, params=params).json()

    # As before, nothing is returned (None) unless the status is OK.
    if data['status'] != 'OK':
        return None

    first = data['results'][0]  # only the top-ranked match is used
    geo_location = {}
    for component in first['address_components']:
        types = component['types']
        if len(types) == 3:
            # Components tagged with three types are keyed by the third one.
            geo_location[types[2]] = component['long_name']
        elif types[0] == 'administrative_area_level_1':
            geo_location['state'] = component['long_name']
        elif types[0] == 'administrative_area_level_2':
            geo_location['city'] = component['long_name']
            geo_location['town'] = geo_location['city']
        else:
            geo_location[types[0]] = component['long_name']

    geo_location['lat'] = first['geometry']['location']['lat']
    # The key is spelled 'lang' (not 'lng') to match the sample output.
    geo_location['lang'] = first['geometry']['location']['lng']
    geo_location['formatted_address'] = first['formatted_address']
    return geo_location

print(reverse_gcode("Starbucks, Chicago"))

The output is a JSON-like dictionary and looks like this:

{'street_number': '8', 'town': 'Cook County', 'locality': 'Chicago', 'city': 'Cook County', 'lat': 41.882413, 'neighborhood': 'Chicago Loop', 'route': 'North Michigan Avenue', 'lang': -87.62468799999999, 'postal_code': '60602', 'country': 'United States', 'formatted_address': '8 N Michigan Ave, Chicago, IL 60602, USA', 'state': 'Illinois'}
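
The function above reads only `results[0]`, so it yields one address per query, whereas the question asks for one row per match. A minimal sketch of how the parsed response could be expanded into rows is shown below. The helper name `rows_from_response` is hypothetical, and it assumes `data` is the dictionary obtained from `result.json()` with the same `status`, `results`, `formatted_address` and `geometry` fields used above. Whether the service actually returns more than one result for a query such as "Starbucks, Chicago" is not guaranteed.

```python
def rows_from_response(store_name, city, data):
    """Turn one parsed geocoding response into output rows, one per result.

    `data` is assumed to have the same shape as the JSON used above:
    a 'status' string and a 'results' list.
    """
    if data.get('status') != 'OK':
        return []  # no rows for failed or empty lookups

    rows = []
    for result in data['results']:
        location = result['geometry']['location']
        rows.append([
            store_name,
            city,
            result['formatted_address'],
            location['lat'],
            location['lng'],
        ])
    return rows
```

Each returned row matches the five fields requested in the question (store_name, city, address, latitude, longitude), so the rows can be passed directly to `csv.writer(...).writerows(...)`.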