Custom cleaning a vector of strings using janitor::make_clean_names()

Question

I have a vector containing the column names of a data frame, and I want to clean those strings.

vec_of_names <- c("FIRST_column", 
                  "another-column", 
                  "ALLCAPS-column", 
                  "cOLumn-with___specialsuffix", 
                  "blah#4-column",
                  "ANOTHER_EXAMPLE___specialsuffix",
                  "THIS_IS-Misleading_specialsuffix")

I specifically want to clean them with janitor::make_clean_names().

janitor::make_clean_names(vec_of_names)

[1] "first_column"                     "another_column"                  
[3] "allcaps_column"                   "c_o_lumn_with_specialsuffix"     
[5] "blah_number_4_column"             "another_example_specialsuffix"   
[7] "this_is_misleading_specialsuffix"

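However, I want to apply the following rules:

When a string ends with ___specialsuffix (that is, three underscores followed by "specialsuffix"):
- Clean only the part before ___specialsuffix with janitor::make_clean_names()
  (that is, the value returned by strsplit(x, "___specialsuffix")).
- Then paste ___specialsuffix back onto the cleaned string.
Otherwise, if the string does not end with ___specialsuffix, clean the whole string with janitor::make_clean_names() as usual.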
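
To make the split in rule #1 concrete, here is a minimal base R sketch of which elements qualify and what the part before the suffix looks like, before any cleaning is applied. The names special_suffix, has_suffix and stems are illustrative only and are not part of janitor.

```r
vec_of_names <- c("FIRST_column",
                  "another-column",
                  "ALLCAPS-column",
                  "cOLumn-with___specialsuffix",
                  "blah#4-column",
                  "ANOTHER_EXAMPLE___specialsuffix",
                  "THIS_IS-Misleading_specialsuffix")

# Illustrative name: the literal suffix that must be preserved
special_suffix <- "___specialsuffix"

# TRUE only where the string ends with three underscores + "specialsuffix"
has_suffix <- endsWith(vec_of_names, special_suffix)

# First piece of each split; fixed = TRUE treats the suffix literally, not as a regex
stems <- vapply(strsplit(vec_of_names, special_suffix, fixed = TRUE),
                function(pieces) pieces[1],
                character(1))

which(has_suffix)   # positions of the elements that fall under rule #1
stems[has_suffix]   # the parts that would be passed to make_clean_names()
```

Element [7] is not flagged because it ends with a single underscore before "specialsuffix", so it falls under the "otherwise" rule.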

The desired output would therefore be:

[1] "first_column"                     "another_column"                  
[3] "allcaps_column"                   "c_o_lumn_with___specialsuffix"     ## elements [4] and [6]
[5] "blah_number_4_column"             "another_example___specialsuffix"   ## were handled according to rule #1
[7] "this_is_misleading_specialsuffix"                                     ## outlined above

Any ideas are much appreciated!

Answer 1

vec_of_names <- c("FIRST_column", 
                  "another-column", 
                  "ALLCAPS-column", 
                  "cOLumn-with___specialsuffix", 
                  "blah#4-column",
                  "ANOTHER_EXAMPLE___specialsuffix",
                  "THIS_IS-Misleading_specialsuffix")


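library(tidyverse)  # provides %>%, stringr::str_extract()/str_remove() and tidyr::replace_na()

# Keep the suffix where it appears at the end of the string; use "" everywhere else
suffix <- vec_of_names %>% str_extract(pattern = "___specialsuffix$") %>% replace_na("")
# Strip the trailing suffix (if any), then clean what remains
cleaned_without_suffix <- vec_of_names %>% str_remove("___specialsuffix$") %>% janitor::make_clean_names()

# Re-attach the suffix to the cleaned names
output <- paste0(cleaned_without_suffix, suffix)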
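
If the same cleaning is needed in more than one place, the steps above could be wrapped in a small helper. The following is only a sketch: clean_names_keep_suffix() and its suffix argument are hypothetical names, not part of janitor, and it uses base R string functions instead of stringr so that the suffix is matched literally.

```r
library(janitor)

# Hypothetical helper (not a janitor function): cleans names while
# preserving a literal suffix at the end of a string unchanged.
clean_names_keep_suffix <- function(x, suffix = "___specialsuffix") {
  has_suffix <- endsWith(x, suffix)
  # Drop the trailing suffix where present; leave other strings untouched
  stem <- ifelse(has_suffix,
                 substr(x, 1, nchar(x) - nchar(suffix)),
                 x)
  # Clean all stems in one call, then re-attach the suffix where it was removed
  paste0(make_clean_names(stem), ifelse(has_suffix, suffix, ""))
}

vec_of_names <- c("FIRST_column",
                  "another-column",
                  "ALLCAPS-column",
                  "cOLumn-with___specialsuffix",
                  "blah#4-column",
                  "ANOTHER_EXAMPLE___specialsuffix",
                  "THIS_IS-Misleading_specialsuffix")

output <- clean_names_keep_suffix(vec_of_names)
```

One thing to keep in mind: make_clean_names() also makes the names unique, so two inputs whose stems clean to the same string (for example "a" and "a___specialsuffix") would get a numeric suffix added to the second stem before ___specialsuffix is pasted back.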
