Spatial joining Geodataframes returns NAs
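I am trying to spatially join two GeoDataFrames with a left join, but the columns coming from the second GeoDataFrame are always NA. I would appreciate any help with this. Here is what I have tried: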
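import geopandas
import pandas

# Load the crime records
police = pandas.read_csv("Police.csv")
# Load the boundary coordinates
uk_boundary = pandas.read_csv("sf_boundary.csv")
# Turn the crime records into point geometries (WGS 84)
police_sf = geopandas.GeoDataFrame(
    police, geometry=geopandas.points_from_xy(x=police.Longitude, y=police.Latitude)
).set_crs(epsg=4326)
# Turn the boundary coordinates into point geometries (WGS 84)
uk_sf = geopandas.GeoDataFrame(
    uk_boundary, geometry=geopandas.points_from_xy(x=uk_boundary.Longitude, y=uk_boundary.Latitude)
).set_crs(epsg=4326)
# Keep only the crime type and geometry columns
police_sf = police_sf.iloc[:, [4, 6]]
# Keep only the name and geometry columns
uk_sf = uk_sf.iloc[:, [1, 4]]
# Left spatial join of crimes onto boundary points
join_pd = geopandas.sjoin(police_sf, uk_sf, how="left")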
Sample of the resulting dataset:
Crime.type geometry \
0 Violence and sexual offences POINT (-0.67902 50.78169)
1 Anti-social behaviour POINT (-2.51692 51.42368)
2 Anti-social behaviour POINT (-2.51277 51.41175)
3 Anti-social behaviour POINT (-2.51444 51.40934)
4 Burglary POINT (-2.51507 51.41936)
... ... ...
18996750 Other theft POINT (-1.75903 50.99465)
18996751 Shoplifting POINT (-1.75155 50.99285)
18996752 Shoplifting POINT (-1.75155 50.99285)
18996753 Violence and sexual offences POINT (-1.74481 50.99320)
18996754 Violence and sexual offences POINT (-1.42572 51.03058)
index_right NAME
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
18996750 NaN NaN
18996751 NaN NaN
18996752 NaN NaN
18996753 NaN NaN
18996754 NaN NaN
Some reproducible data for both datasets:
#police dataset
{'Longitude': {1: -2.516919,2: -2.512773,3: -2.514442,4: -2.515072,5: -2.49487,6: -2.512773,7: -2.495055,8: -2.516919,9: -2.512773,10: -2.495055,11: -2.495055,12: -2.509126,13: -2.495055,14: -2.509126,15: -2.504309,16: -2.498613,17: -2.497799,18: -2.498613,19: -2.498613},'Latitude': {1: 51.423683,2: 51.411751,3: 51.409343,4: 51.419357,5: 51.422276,6: 51.411751,7: 51.422132,8: 51.423683,9: 51.411751,10: 51.422132,11: 51.422132,12: 51.416137,13: 51.422132,14: 51.416137,15: 51.418801,16: 51.416002,17: 51.415233,18: 51.416002,19: 51.416002},'Crime.type': {1: 'Anti-social behaviour',2: 'Anti-social behaviour',3: 'Anti-social behaviour',4: 'Burglary',5: 'Criminal damage and arson',6: 'Criminal damage and arson',7: 'Drugs',8: 'Public order',9: 'Vehicle crime',10: 'Vehicle crime',11: 'Violence and sexual offences',12: 'Violence and sexual offences',13: 'Violence and sexual offences',14: 'Violence and sexual offences',15: 'Anti-social behaviour',16: 'Anti-social behaviour',17: 'Anti-social behaviour',18: 'Anti-social behaviour',19: 'Anti-social behaviour'}}
#map dataset
{'NAME': {1: 'Buckinghamshire',2: 'Buckinghamshire',3: 'Buckinghamshire',4: 'Buckinghamshire',5: 'Buckinghamshire',6: 'Buckinghamshire',7: 'Buckinghamshire',8: 'Buckinghamshire',9: 'Buckinghamshire',10: 'Buckinghamshire',11: 'Buckinghamshire',12: 'Buckinghamshire',13: 'Buckinghamshire',14: 'Buckinghamshire',15: 'Buckinghamshire',16: 'Buckinghamshire',17: 'Buckinghamshire',18: 'Buckinghamshire',19: 'Buckinghamshire'},'Longitude': {1: -0.500579742731822,2: -0.500562231052187,3: -0.500551492843239,4: -0.50060557136444,5: -0.50060719049165,6: -0.500600124159461,7: -0.500586635353007,8: -0.500565771696397,9: -0.500521784112314,10: -0.500121547252066,11: -0.499775648553165,12: -0.499629899275452,13: -0.499526010336186,14: -0.499516042742561,15: -0.49949926212727,16: -0.499472156394348,17: -0.499454478837858,18: -0.499422548303929,19: -0.499384521025904},'Latitude': {1: 51.5995448459169,2: 51.5994186801437,3: 51.5992603285579,4: 51.5988473256556,5: 51.5987547546975,6: 51.5986620736993,7: 51.5985908842995,8: 51.5985195116762,9: 51.5984092787353,10: 51.5976722454391,11: 51.5969555922043,12: 51.5966461998169,13: 51.5963814814086,14: 51.5959397929587,15: 51.5957012684416,16: 51.5954725894707,17: 51.5953949647129,18: 51.5953127525633,19: 51.5952421484913}}
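sjoin needs a spatial relationship between the two layers. Both of the datasets you are joining are points, so there is no topological relationship to capture between them unless two points share exactly the same coordinates. The default op of sjoin (called `predicate` in newer GeoPandas releases) is intersects, so only features that intersect are matched. With how="left", every row on the left that has no match keeps NaN in the right-hand columns, which is what you are seeing.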
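To illustrate the point above, here is a minimal sketch that compares a point-to-point join with a point-in-polygon join. It assumes the rows of the boundary file are the vertices of each area's outline, listed in order around that outline; the question does not confirm this. The coordinates, the name `Area A`, and the variable names `boundary_points` and `areas_sf` are hypothetical and are not taken from the question's data.

```python
import geopandas
import pandas
from shapely.geometry import Polygon

# Hypothetical crime points: one inside the area below, one outside it.
police = pandas.DataFrame({
    "Crime.type": ["Burglary", "Drugs"],
    "Longitude": [0.5, 2.0],
    "Latitude": [0.5, 2.0],
})
police_sf = geopandas.GeoDataFrame(
    police[["Crime.type"]],
    geometry=geopandas.points_from_xy(police.Longitude, police.Latitude),
    crs="EPSG:4326",
)

# Hypothetical boundary vertices, assumed to be ordered around a unit square.
boundary = pandas.DataFrame({
    "NAME": ["Area A"] * 4,
    "Longitude": [0.0, 1.0, 1.0, 0.0],
    "Latitude": [0.0, 0.0, 1.0, 1.0],
})

# Point-to-point join, as in the question: no crime point coincides
# with a boundary vertex, so nothing intersects.
boundary_points = geopandas.GeoDataFrame(
    boundary[["NAME"]],
    geometry=geopandas.points_from_xy(boundary.Longitude, boundary.Latitude),
    crs="EPSG:4326",
)
point_join = geopandas.sjoin(police_sf, boundary_points, how="left")

# Build one polygon per NAME from its ordered vertices.
names, polygons = [], []
for name, group in boundary.groupby("NAME", sort=False):
    names.append(name)
    polygons.append(Polygon(list(zip(group.Longitude, group.Latitude))))
areas_sf = geopandas.GeoDataFrame({"NAME": names}, geometry=polygons, crs="EPSG:4326")

# Point-in-polygon join: the default "intersects" test now has an
# area to match against.
poly_join = geopandas.sjoin(police_sf, areas_sf, how="left")

print(point_join[["Crime.type", "NAME"]])
print(poly_join[["Crime.type", "NAME"]])
```

In this sketch the point-to-point join leaves NAME empty for every row, while the polygon join assigns `Area A` to the crime that falls inside the square and leaves the one outside it as NaN. If the vertices in the real file are not stored in outline order, the polygon built this way would be invalid or the wrong shape, so reading the boundaries from a proper polygon source such as a shapefile or GeoJSON would likely be safer.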