R: Abbreviate state names in strings

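I have strings that contain state names. How can I abbreviate them efficiently? I know about state.abb[grep("New York", state.name)], but that only works when "New York" is the entire string. For example, I have "Walmart, New York". Thanks in advance!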

Let's assume this input:

x = c("Walmart, New York", "Hobby Lobby (California)", "Sold in Sears in Illinois")

Edit: the desired output would be "Walmart, NY", "Hobby Lobby (CA)", "Sold in Sears in IL". As you can see, the state name can appear in the string in several different ways.

Here is a base R approach, using gregexpr(), regmatches(), and regmatches<-():

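abbreviateStateNames <- function(x) {
    # One alternation pattern covering all 50 state names
    pat <- paste(state.name, collapse="|")
    # Locate every state name in each element of x
    m <- gregexpr(pat, x)
    # Map matched names to their two-letter abbreviations
    ff <- function(x) state.abb[match(x, state.name)]
    # Replace the matches in place
    regmatches(x, m) <- lapply(regmatches(x, m), ff)
    x
}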

x <- c("Hobby Lobby (California)", 
       "Hello New York City, here I come (from Greensboro North Carolina)!")

abbreviateStateNames(x)
# [1] "Hobby Lobby (CA)"                                
# [2] "Hello NY City, here I come (from Greensboro NC)!"
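One caveat with the bare alternation pattern: it also matches a state name embedded in a longer word, so "Georgian" would come out as "GAn". If that matters for your data, a minimal sketch of a stricter variant is below. The function name abbreviateStateNamesStrict and the \\b word-boundary anchors are my own additions for illustration, not part of the approach above.

```r
# Hypothetical stricter variant: only whole-word state names are replaced.
abbreviateStateNamesStrict <- function(x) {
    # \\b anchors require a word boundary on both sides of the name;
    # (?: ) is a non-capturing group, which needs perl = TRUE.
    pat <- paste0("\\b(?:", paste(state.name, collapse = "|"), ")\\b")
    m <- gregexpr(pat, x, perl = TRUE)
    # Same lookup as before: matched name -> two-letter abbreviation
    regmatches(x, m) <- lapply(regmatches(x, m),
                               function(s) state.abb[match(s, state.name)])
    x
}

y <- c("Walmart, New York", "Georgian wine", "West Virginia")
abbreviateStateNamesStrict(y)
# [1] "Walmart, NY"   "Georgian wine" "WV"
```

Matching is still case-sensitive, so "new york" would be left untouched.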

Or, more naturally, you could do the same thing with the gsubfn package:

library(gsubfn)

pat <- paste(state.name, collapse="|")
gsubfn(pat, function(x) state.abb[match(x, state.name)], x)
# [1] "Hobby Lobby (CA)"                                
# [2] "Hello NY City, here I come (from Greensboro NC)!"
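The examples here use a different x from the one in the question, so as a quick check, this sketch runs the same base R steps on the question's input and compares the result with the desired output. The variable names want and got are just for this illustration. The gsubfn call should work in place of the regmatches lines if you have that package installed.

```r
# Input and desired output copied from the question
x <- c("Walmart, New York", "Hobby Lobby (California)", "Sold in Sears in Illinois")
want <- c("Walmart, NY", "Hobby Lobby (CA)", "Sold in Sears in IL")

# Same steps as the base R function, written inline
pat <- paste(state.name, collapse = "|")
m <- gregexpr(pat, x)
got <- x
regmatches(got, m) <- lapply(regmatches(got, m),
                             function(s) state.abb[match(s, state.name)])

identical(got, want)
# [1] TRUE
```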