Make duplicate elements of a character vector unique, but not like make.unique()
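When several references share the same author (or authors) and publication year, the usual practice is to append a lowercase letter to the year. I am looking for an elegant function that does this: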
# what I have
have <- c("Dawkins (2008)",
"Dawkins (2008)",
"Stephenson (2008)")
# what I want
want <- c("Dawkins (2008a)",
"Dawkins (2008b)",
"Stephenson (2008)")
# this would do the job, but is not really what I want
make.unique(have)
#> [1] "Dawkins (2008)" "Dawkins (2008).1" "Stephenson (2008)"
Created on 2022-02-24 by the reprex package (v2.0.1)
Edit: solution based on @akrun's answer below
library(dplyr)
library(stringr)
have <- c("Dawkins (2008)",
"Dawkins (2008)",
"Stephenson (2008)")
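f <- function(x) {
  # Suffix per element: "a", "b", ... within groups of duplicates, "" for unique entries
  v1 <- ave(x, x, FUN = function(x) if (length(x) > 1) letters[seq_along(x)] else "")
  # "\\)" is a regex-escaped literal ")"; the suffix is inserted just before it
  stringr::str_replace(x, "\\)", stringr::str_c(v1, ")"))
}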
data.frame(ha = have) %>%
  mutate(want = f(ha))
#>                  ha              want
#> 1    Dawkins (2008)   Dawkins (2008a)
#> 2    Dawkins (2008)   Dawkins (2008b)
#> 3 Stephenson (2008) Stephenson (2008)
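We can index into `letters` by position within each group of duplicates (assuming no group has more than 26 members), then use `str_replace` to insert the letter before the closing `)`: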
library(stringr)
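# Suffix per element: "a", "b", ... within groups of duplicates, "" for unique entries
v1 <- ave(have, have, FUN = function(x)
  if (length(x) > 1) letters[seq_along(x)] else "")
# "\\)" is a regex-escaped literal ")"; str_replace() only touches the first match
str_replace(have, "\\)", str_c(v1, ")"))
#> [1] "Dawkins (2008a)" "Dawkins (2008b)" "Stephenson (2008)"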
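For anyone who wants to avoid the stringr dependency, the same idea can be sketched in base R. This is only a sketch under a few assumptions that are not in the original answer: the helper name `add_year_suffix` is made up for illustration, every string is assumed to end in `)`, and the function stops with an error rather than silently producing `NA` when a group has more than 26 duplicates.

```r
# Hypothetical helper (the name is an assumption, not part of the answer above).
# Assumes every element of x ends with ")", e.g. "Dawkins (2008)".
add_year_suffix <- function(x) {
  # Position of each element within its group of identical strings
  idx <- ave(seq_along(x), x, FUN = seq_along)
  # Size of the group each element belongs to
  n <- ave(seq_along(x), x, FUN = length)
  if (any(n > 26)) {
    stop("more than 26 identical entries; `letters` only has 26 elements")
  }
  # Letter suffix only where the string is actually duplicated
  suffix <- ifelse(n > 1, letters[idx], "")
  # Drop the final ")" and re-append it after the suffix
  paste0(substr(x, 1, nchar(x) - 1), suffix, ")")
}

have <- c("Dawkins (2008)",
          "Dawkins (2008)",
          "Stephenson (2008)")
add_year_suffix(have)
```

Because the suffix goes before the last character rather than the first `)` found, an entry with parentheses in the author part should still get its letter next to the year, provided the assumption about the trailing `)` holds.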