Using sf in R - what is the best way to add geometry to a large point dataset?

Disclaimer: I have only just started using sf, so I may (hopefully!) be missing something obvious here.

I have the AusGeoid2020 data, which consists of 15,454,800 points and some attributes used to convert between ellipsoidal heights (i.e. GPS heights) and the AHD.

Although the file is large (914 MB), it is easy enough to read in:

library(plyr)
library(magrittr)
library(dplyr)
library(readr)
library(sf)

AusGeoid2020 <- read_fwf(
  file = "AUSGeoid2020_20170908_win.dat",
  col_positions = fwf_widths(
    widths = c(3L,9L,2L,2L,3L,7L,2L,3L,3L,7L,10L,10L),
    col_names = c(
      "ID",
      "ellipsoid to AHD separation (m)",
      "Latitude (hem)",
      "Latitude (deg)",
      "Latitude (min)",
      "Latitude (sec)",
      "Longitude (hem)",
      "Longitude (deg)",
      "Longitude (min)",
      "Longitude (sec)",
      "deflection of the vertical (seconds, xi)",
      "deflection of the vertical (seconds, eta)"
    )
  ),
  col_types = cols(
    ID = col_character(),
    `ellipsoid to AHD separation (m)` = col_double(),
    `Latitude (hem)` = col_character(),
    `Latitude (deg)` = col_double(),
    `Latitude (min)` = col_double(),
    `Latitude (sec)` = col_double(),
    `Longitude (hem)` = col_character(),
    `Longitude (deg)` = col_double(),
    `Longitude (min)` = col_double(),
    `Longitude (sec)` = col_double(),
    `deflection of the vertical (seconds, xi)` = col_double(),
    `deflection of the vertical (seconds, eta)` = col_double()
  ),
  skip = 1L
)

AusGeoid2020 <- AusGeoid2020 %>% 
  mutate(
    Latitude = `Latitude (deg)` + (`Latitude (min)`/60) + (`Latitude (sec)`/3600),
    Latitude = case_when(
      `Latitude (hem)` == "S" ~ -1 * Latitude,
      TRUE ~ Latitude
    ),
    Longitude = `Longitude (deg)` + (`Longitude (min)`/60) + (`Longitude (sec)`/3600),
    Longitude = case_when(
      `Longitude (hem)` == "W" ~ -1 * Longitude,
      TRUE ~ Longitude
    )
  ) %>% 
  select(
    ID,
    `ellipsoid to AHD separation (m)`,
    Latitude,
    Longitude,
    `deflection of the vertical (seconds, xi)`,
    `deflection of the vertical (seconds, eta)`
  )

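My question is: what is the best way to add geometry to this large data frame? I believe the function I want is st_point(), which is not vectorised, so I resorted to using alply() from {plyr} to create the geometry column one row at a time. This is very resource intensive, which makes me think there must be a better way.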

st_geometry(AusGeoid2020) <- st_sfc(
  alply(AusGeoid2020, 1, function(row) {
    st_point(x = c(row$Longitude, row$Latitude), dim = "XY")
  }),
  crs = 7844L
)

This takes a very long time. Any suggestions are appreciated!

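We can use st_as_sf() as follows. It builds the geometry column from the coordinate columns in a single call, so there is no need to create the points row by row. By default it drops the columns holding the coordinates (here Longitude and Latitude). If you want to keep those columns, set remove = FALSE.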

AusGeoid2020_sf <- AusGeoid2020 %>% 
  st_as_sf(coords = c("Longitude", "Latitude"), crs = 7844L, remove = FALSE)
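
To see what this does without loading the full 15-million-row file, here is a minimal sketch on a toy data frame. The data frame pts and its values are made up for illustration and are not taken from the AusGeoid2020 file. Only the column names and the CRS code follow the question.

```r
library(sf)

# Hypothetical three-row stand-in for AusGeoid2020 (values are invented)
pts <- data.frame(
  ID = c("a", "b", "c"),
  Longitude = c(115.86, 151.21, 144.96),
  Latitude = c(-31.95, -33.87, -37.81)
)

# coords is given in x, y order, so longitude comes first
pts_sf <- st_as_sf(
  pts,
  coords = c("Longitude", "Latitude"),
  crs = 7844L,
  remove = FALSE
)

# Inspect the result: one POINT geometry per row, and because
# remove = FALSE the original coordinate columns are still present
st_geometry_type(pts_sf, by_geometry = FALSE)
names(pts_sf)
head(st_coordinates(pts_sf))
```

Because the coordinates are passed as whole columns, the conversion happens in one step rather than through one st_point() call per row, which is why it should scale much better than the alply() approach in the question.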