Difference between `paste`, `str_c`, `str_join`, `stri_join`, `stri_c`, `stri_paste`?

What is the difference between all these functions that look so similar?

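  • stri_join, stri_c, and stri_paste come from package stringi and are pure aliases of one another.

  • str_c comes from stringr and is just stringi::stri_join with the parameter ignore_null hardcoded to TRUE, whereas stringi::stri_join sets it to FALSE by default.

  • stringr::str_join is a deprecated alias for str_c.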

See:

library(stringi)
identical(stri_join, stri_c)
# [1] TRUE
identical(stri_join, stri_paste)
# [1] TRUE

library(stringr)
str_c
# function (..., sep = "", collapse = NULL) 
# {
#   stri_c(..., sep = sep, collapse = collapse, ignore_null = TRUE)
# }
# <environment: namespace:stringr>

stri_join is very similar to base::paste; the differences are listed below:


1. sep = "" by default

So by default it behaves like paste0, except that paste0 has no sep argument, whereas stri_join keeps one.

identical(paste0("a","b")        , stri_join("a","b"))
# [1] TRUE
identical(paste("a","b")         , stri_join("a","b",sep=" "))
# [1] TRUE
identical(paste("a","b", sep="-"), stri_join("a","b", sep="-"))
# [1] TRUE

str_c behaves the same as stri_join here.


2. Behavior with NA

If you paste to an NA using stri_join, the result is NA, whereas paste converts NA to the string "NA".

paste0(c("a","b"),c("c",NA))
# [1] "ac"  "bNA"
stri_join(c("a","b"),c("c",NA))
# [1] "ac" NA

str_c behaves the same as stri_join here as well.
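If the paste-style result is wanted from stri_join, one option is to convert missing values to text before joining. The following is a minimal sketch, assuming stringi is installed; it uses stringi's stri_replace_na, whose default replacement is the string "NA". The variable names are illustrative only.

```r
library(stringi)

x <- c("a", "b")
y <- c("c", NA)

# Default stri_join behavior: an NA in any input makes that result NA.
joined_na <- stri_join(x, y)

# Replace NA with the string "NA" first to mimic paste0().
joined_text <- stri_join(x, stri_replace_na(y))

joined_na    # "ac" NA
joined_text  # "ac" "bNA"
```

Propagating NA is usually the safer default, because a literal "NA" in the output can hide missing data further down the pipeline.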


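3. Behavior with length 0 arguments

When a length 0 value is encountered, character(0) is returned, unless ignore_null is set to TRUE, in which case the value is ignored. This differs from paste, which converts a length 0 value to "" and therefore puts 2 consecutive separators in the output.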

stri_join("a",NULL, "b")  
# character(0)
stri_join("a",character(0), "b")  
# character(0)

paste0("a",NULL, "b")
# [1] "ab"
stri_join("a",NULL, "b", ignore_null = TRUE)
# [1] "ab"
str_c("a",NULL, "b")
# [1] "ab"

paste("a",NULL, "b") # produces double space!
# [1] "a  b" 
stri_join("a",NULL, "b", ignore_null = TRUE, sep = " ")
# [1] "a b"
str_c("a",NULL, "b", sep = " ")
# [1] "a b"
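One place where ignore_null = TRUE is convenient is a function with an optional string part. The sketch below assumes stringi is installed; make_label is a hypothetical helper written for illustration and is not part of either package.

```r
library(stringi)

# Hypothetical helper: join a name with an optional suffix.
# With ignore_null = TRUE a NULL suffix is dropped together with its separator.
make_label <- function(name, suffix = NULL) {
  stri_join(name, suffix, sep = "_", ignore_null = TRUE)
}

make_label("x", "1")  # "x_1"
make_label("x")       # "x"
```

With the default ignore_null = FALSE, the second call would return character(0) instead, which is the behavior shown at the start of this section.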

4. stri_join warns more

paste(c("a","b"),c("c","d","e"))
# [1] "a c" "b d" "a e"
paste("a","b", sep = c(" ","-"))
# [1] "a b"

stri_join(c("a","b"),c("c","d","e"), sep = " ")
# [1] "a c" "b d" "a e"
# Warning message:
#   In stri_join(c("a", "b"), c("c", "d", "e"), sep = " ") :
#   longer object length is not a multiple of shorter object length
stri_join("a","b", sep = c(" ","-"))
# [1] "a b"
# Warning message:
#   In stri_join("a", "b", sep = c(" ", "-")) :
#   argument `sep` should be one character string; taking the first one
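Because stri_join signals these problems as ordinary R warnings, they can be promoted to errors where silent recycling would be a bug. This is a minimal sketch using base R's tryCatch; strict_join is a hypothetical wrapper named for illustration only.

```r
library(stringi)

# Hypothetical wrapper: run stri_join() and turn any warning into an error.
strict_join <- function(...) {
  tryCatch(
    stri_join(...),
    warning = function(w) {
      stop("stri_join warned: ", conditionMessage(w), call. = FALSE)
    }
  )
}

# Lengths match, so there is no warning and the result is returned as usual.
ok <- strict_join(c("a", "b"), c("c", "d"), sep = " ")

# Mismatched lengths trigger the recycling warning, which now stops execution.
bad <- try(strict_join(c("a", "b"), c("c", "d", "e"), sep = " "), silent = TRUE)

ok                         # "a c" "b d"
inherits(bad, "try-error") # TRUE
```

The same wrapper would do nothing useful around paste, since paste recycles mismatched lengths without warning.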

5. stri_join is faster

microbenchmark::microbenchmark(
  stringi = stri_join(rep("a",1000000),rep("b",1000),"c",sep=" "),
  base    = paste(rep("a",1000000),rep("b",1000),"c")
)

# Unit: milliseconds
#    expr       min       lq      mean    median       uq      max neval cld
# stringi  88.54199  93.4477  97.31161  95.17157  96.8879 131.9737   100  a 
# base    166.01024 169.7189 178.31065 171.30910 176.3055 215.5982   100   b