Custom cleaning a vector of strings using janitor::make_clean_names()
I have a vector containing the column names of a data frame, and I want to clean those strings.
vec_of_names <- c("FIRST_column",
"another-column",
"ALLCAPS-column",
"cOLumn-with___specialsuffix",
"blah#4-column",
"ANOTHER_EXAMPLE___specialsuffix",
"THIS_IS-Misleading_specialsuffix")
I specifically want to use janitor::make_clean_names() for the cleaning.
janitor::make_clean_names(vec_of_names)
[1] "first_column" "another_column"
[3] "allcaps_column" "c_o_lumn_with_specialsuffix"
[5] "blah_number_4_column" "another_example_specialsuffix"
[7] "this_is_misleading_specialsuffix"
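However, I want to apply the following rules:
1. When a string ends with ___specialsuffix (i.e. three underscores followed by "specialsuffix"), use janitor::make_clean_names() to clean only the part that precedes ___specialsuffix (that is, the value returned by strsplit(x, "___specialsuffix")), then paste ___specialsuffix back onto the cleaned string.
2. Otherwise, if the string does not end with ___specialsuffix, clean the whole string with janitor::make_clean_names() as usual.
The desired output would therefore be: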
[1] "first_column" "another_column"
[3] "allcaps_column" "c_o_lumn_with___specialsuffix" ## elements [4] and [6]
[5] "blah_number_4_column" "another_example___specialsuffix" ## were handled according to rule #1
[7] "this_is_misleading_specialsuffix" ## outlined above
Any ideas are greatly appreciated!
vec_of_names <- c("FIRST_column",
"another-column",
"ALLCAPS-column",
"cOLumn-with___specialsuffix",
"blah#4-column",
"ANOTHER_EXAMPLE___specialsuffix",
"THIS_IS-Misleading_specialsuffix")
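library(tidyverse)
# Keep the literal suffix where it is present; use "" where it is absent
suffix <- vec_of_names %>% str_extract(pattern = "___specialsuffix$") %>% replace_na("")
# Strip the suffix (if any) and clean only what remains
cleaned_without_suffix <- vec_of_names %>% str_remove("___specialsuffix$") %>% janitor::make_clean_names()
# Re-attach the untouched suffix
output <- paste0(cleaned_without_suffix, suffix)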
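If this needs to be reused, the same idea can be wrapped in a small function. The following is only a sketch: clean_names_keep_suffix() is a hypothetical helper name (it is not part of janitor), and it uses base R's endsWith() and substr() instead of a regular expression so that the suffix is matched literally. It assumes janitor is installed.

```r
# Hypothetical helper (not part of janitor): clean names but leave a
# literal trailing suffix untouched.
clean_names_keep_suffix <- function(x, suffix = "___specialsuffix") {
  # TRUE for elements that end with the exact suffix
  has_suffix <- endsWith(x, suffix)
  # Drop the suffix from those elements; leave the others as they are
  stem <- ifelse(has_suffix, substr(x, 1, nchar(x) - nchar(suffix)), x)
  # Clean only the stems, then paste the suffix back where it was present
  cleaned <- janitor::make_clean_names(stem)
  paste0(cleaned, ifelse(has_suffix, suffix, ""))
}

vec_of_names <- c("FIRST_column",
                  "another-column",
                  "ALLCAPS-column",
                  "cOLumn-with___specialsuffix",
                  "blah#4-column",
                  "ANOTHER_EXAMPLE___specialsuffix",
                  "THIS_IS-Misleading_specialsuffix")

output <- clean_names_keep_suffix(vec_of_names)
print(output)
```

One thing to keep in mind with either version: make_clean_names() also de-duplicates its results, and here it sees the stems rather than the full names. If two inputs differ only by the suffix, for example "a" and "a___specialsuffix", one of the stems may get a numeric suffix such as _2 even though the final names would have been distinct anyway.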