Geopy, pandas, FOR loop fail
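I'm teaching myself geopy. It seems simple and straightforward, but my code isn't working. It is supposed to:
- read a list of address fields from a CSV into a pandas df
- concatenate the address fields into a single column in a format geopy can use
- make a list from the new column
- feed each item in the list into geopy via a for loop and return the coordinates
- add the coordinates to the original df and export it to a CSV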
#setup
from geopy.geocoders import Nominatim
import pandas as pd
#create the df
df = pd.DataFrame(pd.read_csv('properties to geocode.csv'))
df['Location'] = df['Street Address'].astype(str)+","+df['City'].astype(str)+","+df['State'].astype(str)
#create the geolocator object
geolocator = Nominatim(timeout=1, user_agent = "My_Agent")
#create the locations list
locations = df['Location']
#empty lists for later columns
lats = []
longs = []
#process the location list
for item in locations:
    location = geolocator.geocode('item')
    lat = location.latitude
    long = location.longitude
    lats.append(lat)
    longs.append(long)
#add the lists to the df
df.insert(5,'Latitude',lats)
df.insert(6,'Longitude',longs)
#export
df.to_csv('geocoded-properties2.csv',index=False)
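Something isn't working: it returns the same latitude and longitude values for every row instead of unique coordinates for each row.
I found working code elsewhere that uses .apply, but I'm interested in understanding what I did wrong. Any ideas?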
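- Your code does not include sample data, so I have used some sample data from a public API to demonstrate.
- Your code passes a string literal to geolocator.geocode(). It needs to be the address associated with the row.
- I have provided examples using pandas apply, a list comprehension, and a for loop that is the equivalent of the comprehension.
- The results show that all three approaches are equivalent.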
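Applied to the loop in the question, the second point comes down to passing the loop variable `item` rather than the quoted string `'item'`, which sends the same query on every iteration. The following is a minimal sketch of that fix, not a definitive implementation. The function name `geocode_addresses`, the `fake_geocode` stand-in, and the sample addresses and coordinates are assumptions made up so the example runs without network access.

```python
from types import SimpleNamespace


def geocode_addresses(addresses, geocode):
    """Geocode each address and return parallel latitude/longitude lists."""
    lats, longs = [], []
    for item in addresses:
        # Pass the loop variable, not the string literal 'item'.
        location = geocode(item)
        if location is None:
            # No match: keep the row aligned by recording a missing value.
            lats.append(None)
            longs.append(None)
        else:
            lats.append(location.latitude)
            longs.append(location.longitude)
    return lats, longs


# Hypothetical offline stand-in for geolocator.geocode so the sketch runs
# without network access; the coordinates are made up for illustration.
_FAKE_RESULTS = {
    "1 Main St,Springfield,IL": SimpleNamespace(latitude=1.0, longitude=2.0),
    "2 Oak Ave,Portland,OR": SimpleNamespace(latitude=3.0, longitude=4.0),
}


def fake_geocode(query):
    return _FAKE_RESULTS.get(query)


addresses = ["1 Main St,Springfield,IL", "2 Oak Ave,Portland,OR", "nowhere"]
lats, longs = geocode_addresses(addresses, fake_geocode)
print(lats)   # [1.0, 3.0, None]
print(longs)  # [2.0, 4.0, None]
```

With geopy, `geolocator.geocode` from the question could be passed as `geocode` and `df['Location']` as `addresses`. The `None` check matters because `geocode` returns `None` when it finds no match, in which case reading `.latitude` would raise an `AttributeError`.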
from geopy.geocoders import Nominatim
import requests
import pandas as pd

searchendpoint = "https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations"

# get all healthcare facilities in Herefordshire (postcodes HR1 to HR9)
dfhc = pd.concat(
    [
        pd.json_normalize(
            requests.get(
                searchendpoint, params={"PostCode": f"HR{i}", "Status": "Active"}
            ).json()["Organisations"]
        )
        for i in range(1, 10)
    ]
).reset_index(drop=True)


def gps(url, geolocator=None):
    # get the address and construct a space-delimited string
    a = " ".join(
        str(x)
        for x in requests.get(url).json()["Organisation"]["GeoLoc"]["Location"].values()
    )
    lonlat = geolocator.geocode(a)
    if lonlat is not None:
        # index 1 of a geopy Location is the (latitude, longitude) tuple
        return lonlat[1]
    else:
        # placeholder when the address could not be geocoded
        return (0, 0)


# work with just GPs
dfgp = dfhc.loc[dfhc.PrimaryRoleId.isin(["RO180", "RO96"])].head(5).copy()
geolocator = Nominatim(timeout=1, user_agent="My_Agent")

# pandas apply
dfgp["lonlat_apply"] = dfgp["OrgLink"].apply(gps, geolocator=geolocator)

# list comprehension
lonlat = [gps(url, geolocator=geolocator) for url in dfgp["OrgLink"].values]
dfgp["lonlat_listcomp"] = lonlat

# old school loop
lonlat = []
for item in dfgp["OrgLink"].values:
    lonlat.append(gps(item, geolocator=geolocator))
dfgp["lonlat_oldschool"] = lonlat
| | Name | OrgId | Status | OrgRecordClass | PostCode | LastChangeDate | PrimaryRoleId | PrimaryRoleDescription | OrgLink | lonlat_apply | lonlat_listcomp | lonlat_oldschool |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | AYLESTONE HILL SURGERY | M81026002 | Active | RC2 | HR1 1HR | 2020-03-19 | RO96 | BRANCH SURGERY | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/M81026002 | (52.0612429, -2.7026047) | (52.0612429, -2.7026047) | (52.0612429, -2.7026047) |
| 9 | BARRS COURT SCHOOL | 5CN91 | Active | RC2 | HR1 1EQ | 2021-01-28 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN91 | (52.0619209, -2.7086105) | (52.0619209, -2.7086105) | (52.0619209, -2.7086105) |
| 13 | BODENHAM SURGERY | 5CN24 | Active | RC2 | HR1 3JU | 2013-05-08 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN24 | (52.152405, -2.6671942) | (52.152405, -2.6671942) | (52.152405, -2.6671942) |
| 22 | BELMONT ABBEY | 5CN16 | Active | RC2 | HR2 9RP | 2013-05-08 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN16 | (52.0423056, -2.7648698) | (52.0423056, -2.7648698) | (52.0423056, -2.7648698) |
| 24 | BELMONT HEALTH CENTRE | 5CN22 | Active | RC2 | HR2 7XT | 2013-05-08 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN22 | (52.0407746, -2.739788) | (52.0407746, -2.739788) | (52.0407746, -2.739788) |
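The question asked for separate Latitude and Longitude columns, while `gps()` returns one (latitude, longitude) tuple per row. As a rough sketch under stated assumptions, the tuples could be split into two parallel lists like this. The helper name `split_pairs` is hypothetical, and treating the `(0, 0)` placeholder as missing is a design choice that is not part of the original answer. The first two sample pairs are copied from the results above.

```python
def split_pairs(pairs, missing=(0, 0)):
    """Split (latitude, longitude) tuples into two parallel lists.

    Pairs equal to `missing` (the placeholder gps() returns when nothing
    was found) become None, so they are not mistaken for real coordinates.
    """
    lats, longs = [], []
    for pair in pairs:
        if pair == missing:
            lats.append(None)
            longs.append(None)
        else:
            lat, lon = pair
            lats.append(lat)
            longs.append(lon)
    return lats, longs


# Sample input in the shape gps() returns; the last entry is the placeholder.
lonlat = [
    (52.0612429, -2.7026047),
    (52.0619209, -2.7086105),
    (0, 0),
]

lats, longs = split_pairs(lonlat)
print(lats)   # [52.0612429, 52.0619209, None]
print(longs)  # [-2.7026047, -2.7086105, None]

# With pandas, the two lists could then be attached as columns, e.g.:
# dfgp["Latitude"] = lats
# dfgp["Longitude"] = longs
```

Because the lists keep the same order and length as the input, they line up with the DataFrame rows they were computed from.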