R-Geocoding with Address
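I have 32K rows of addresses for which I need to find the longitude/latitude values.
I'm using the code found here. I'm very grateful to the person who created it, but I have one question:
I'd like to edit it so that if the loop runs into a problem with the current row's address, it simply puts NA in the Lat/Long fields and moves on to the next row. Does anyone know how to accomplish this? The code is below: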
# Geocoding a csv column of "addresses" in R
#load ggmap
library(ggmap)
# Select the file from the file chooser
fileToLoad <- file.choose(new = TRUE)
# Read in the CSV data and store it in a variable
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE)
# Initialize the data frame
geocoded <- data.frame(stringsAsFactors = FALSE)
# Loop through the addresses to get the latitude and longitude of each address and add it to the
# origAddress data frame in new columns lat and lon
for(i in 1:nrow(origAddress))
{
  # Print("Working...")
  result <- geocode(origAddress$addresses[i], output = "latlona", source = "google")
  origAddress$lon[i] <- as.numeric(result[1])
  origAddress$lat[i] <- as.numeric(result[2])
  origAddress$geoAddress[i] <- as.character(result[3])
}
# Write a CSV file containing origAddress to the working directory
write.csv(origAddress, "geocoded.csv", row.names=FALSE)
You can use tryCatch() to isolate geocoding warnings and return a data.frame with the same structure (lon, lat, address) that geocode() would return.
Your code would be:
# Geocoding a csv column of "addresses" in R
# load ggmap
library(ggmap)
# Select the file from the file chooser
fileToLoad <- file.choose(new = TRUE)
# Read in the CSV data and store it in a variable
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE)
# Loop through the addresses to get the latitude and longitude of each address and add it to the
# origAddress data frame in new columns lat and lon
for(i in 1:nrow(origAddress)) {
  result <- tryCatch(geocode(origAddress$addresses[i], output = "latlona", source = "google"),
                     warning = function(w) data.frame(lon = NA, lat = NA, address = NA))
  origAddress$lon[i] <- as.numeric(result[1])
  origAddress$lat[i] <- as.numeric(result[2])
  origAddress$geoAddress[i] <- as.character(result[3])
}
# Write a CSV file containing origAddress to the working directory
write.csv(origAddress, "geocoded.csv", row.names=FALSE)
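Note that the handler above only intercepts warnings; if a lookup ever fails with an actual error, the loop would still stop. Here is a minimal sketch that handles both cases. To keep it runnable without an API key it uses fake_geocode(), a made-up stand-in for geocode() (it is not part of ggmap), and safe_geocode() is likewise just a name chosen for illustration:

```r
# Hypothetical stand-in for ggmap::geocode(); NOT part of ggmap.
# It errors on a missing address and warns on an empty one,
# to mimic the two ways a lookup could fail.
fake_geocode <- function(address) {
  if (is.na(address)) stop("address is missing")
  if (!nzchar(address)) warning("address is empty")
  data.frame(lon = 1.5, lat = 2.5, address = tolower(address),
             stringsAsFactors = FALSE)
}

# Wrapper that returns a one-row NA data.frame on any warning or error
safe_geocode <- function(address, geocoder = fake_geocode) {
  na_row <- data.frame(lon = NA_real_, lat = NA_real_,
                       address = NA_character_, stringsAsFactors = FALSE)
  tryCatch(geocoder(address),
           warning = function(w) na_row,
           error   = function(e) na_row)
}

# One good address, one that triggers a warning, one that triggers an error
addresses <- c("1 Main St", "", NA)
results <- do.call(rbind, lapply(addresses, safe_geocode))
```

With real data you would presumably pass something like geocoder = function(a) geocode(a, output = "latlona", source = "google") instead of the stand-in, and then bind results onto origAddress.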
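Alternatively, you can do this faster and more cleanly without the loop and error checking. However, without a reproducible example of your data, there's no way to know whether this will preserve all the information you need.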
# Substituted for for loop
result <- geocode(origAddress$addresses, output = "latlona", source = "google")
origAddress <- cbind(origAddress$addresses, result)
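One thing to watch with the cbind() call above: it keeps only the addresses column from origAddress. If your file has other columns you need, binding the whole data frame may be closer to what you want. A small sketch with made-up data (the id column and all values are invented, and result is a hand-built stand-in for geocode() output, assuming failed lookups come back as NA rows, which is worth verifying against your ggmap version):

```r
# Made-up input; the id column is hypothetical
origAddress <- data.frame(id = 1:2,
                          addresses = c("1 Main St", "2 Oak Ave"),
                          stringsAsFactors = FALSE)

# Hand-built stand-in for what geocode(..., output = "latlona") might return
result <- data.frame(lon = c(-1.0, NA), lat = c(50.0, NA),
                     address = c("1 main st", NA),
                     stringsAsFactors = FALSE)

# Bind the full data frame, not just one column, so id survives
combined <- cbind(origAddress, result)

# Rows where the lookup failed, for review or a later retry
failed <- combined[is.na(combined$lon), ]
```

Keeping the failed rows in a separate data frame makes it easy to re-run just those addresses later instead of all 32K.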