Making a for loop for a character vector in R
char_vector <- c("Africa", "identical", "ending" ,"aa" ,"bb", "rain" ,"Friday" ,"transport") # character vector
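Suppose I have the character vector above.
I want to create a for loop that prints to the screen only the elements of the vector that have more than 5 characters and start with a vowel,
and that removes from the vector the elements that do not start with a vowel.
I wrote this for loop, but it also prints empty character vectors: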
for (i in char_vector){
  if (str_length(i) > 5){
    i <- str_subset(i, "^[AEIOUaeiou]")
    print(i)
  }
}
The result of the above is:
[1] "Africa"
[1] "identical"
[1] "ending"
character(0)
character(0)
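The result I want would be only the first 3 characters.
I am really new to R and am having great difficulty creating a for loop for this problem. Any help would be greatly appreciated!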
Use grepl with the pattern ^[AEIOUaeiou]\w{5,}$ (written with a doubled backslash, \\w, inside an R string):
char_vector <- c("Africa", "identical", "ending" ,"aa" ,"bb", "rain" ,"Friday" ,"transport")
char_vector <- char_vector[grepl("^[AEIOUaeiou]\\w{5,}$", char_vector)]
char_vector
[1] "Africa" "identical" "ending"
The regex pattern used here matches words as follows:
^ from the start of the word
[AEIOUaeiou] starts with a vowel
\w{5,} followed by 5 or more word characters (total length > 5)
$ end of the word
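To make the breakdown above concrete, here is a minimal sketch that checks each part of the pattern separately and compares it with the combined pattern. The variable names (words, starts_with_vowel, longer_than_five, full_match) are only illustrative. Note that \w matches word characters only, so the two approaches could differ for strings containing spaces or hyphens.

```r
words <- c("Africa", "identical", "ending", "aa", "bb", "rain", "Friday", "transport")

# Each condition tested on its own
starts_with_vowel <- grepl("^[AEIOUaeiou]", words)
longer_than_five  <- nchar(words) > 5

# The combined pattern: one vowel followed by 5 or more word characters
full_match <- grepl("^[AEIOUaeiou]\\w{5,}$", words)

# For these words, the single pattern agrees with the two separate checks
identical(full_match, starts_with_vowel & longer_than_five)
words[full_match]
```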
The first 3 characters?
library(stringr)
for (i in char_vector){
  # && is the scalar "and", which is what if() expects
  if (str_length(i) > 5 && str_detect(i, "^[AEIOUaeiou]")) {
    word <- str_sub(i, 1, 3)
    print(word)
  }
}
Output:
[1] "Afr"
[1] "ide"
[1] "end"
With stringr functions, you would rather use str_detect than str_subset, and you can take advantage of the fact that these functions are vectorized:
library(stringr)
char_vector[str_length(char_vector) > 5 & str_detect(char_vector, "^[AEIOUaeiou]")]
#[1] "Africa" "identical" "ending"
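The reason the original loop printed character(0) can be sketched as follows. This example assumes the base R functions grep(value = TRUE) and grepl as stand-ins for str_subset and str_detect, so that it runs without stringr; the variable names are illustrative.

```r
pattern <- "^[AEIOUaeiou]"

# Subsetting a single non-matching string leaves nothing behind: character(0)
subset_result <- grep(pattern, "Friday", value = TRUE)
length(subset_result)

# Detecting returns exactly one TRUE/FALSE, which is what if() needs
detect_result <- grepl(pattern, "Friday")
detect_result
```

Because the subsetting call returns an empty vector rather than FALSE, printing its result inside the loop shows character(0) for every long word that does not start with a vowel.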
Or, if you want a for loop that collects the results into a single vector:
vec <- character(0) # start with an empty character vector
for (i in char_vector){
  if (str_length(i) > 5 && str_detect(i, "^[AEIOUaeiou]")){
    vec <- c(vec, i) # append the matching element
  }
}
vec
# [1] "Africa" "identical" "ending"
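Using base R functions only. No loop is needed. I wrapped the steps in a function so that you can use it with other character vectors. You could shorten this code, but I find the "one line, one step" approach makes the process easier to understand.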
char_vector <- c("Africa", "identical", "ending" ,"aa" ,"bb", "rain" ,"Friday" ,"transport")
yourfun <- function(char_vector){
  char_vector <- char_vector[nchar(char_vector) > 5] # keep only the strings that are more than 5 characters long
  char_vector <- char_vector[grep(pattern = "^[AEIOUaeiou]", char_vector)] # keep only the strings that start with a vowel
  return(char_vector) # return the filtered strings
  # to get the first three characters of each string instead, replace the return() line above with:
  # out <- substring(char_vector, 1, 3) # select only the first 3 characters of each string
  # return(out)
}
yourfun(char_vector = char_vector)
#> [1] "Africa" "identical" "ending"
Created on 2022-05-09 by the reprex package (v2.0.1)
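As a rough illustration of how the function above could be shortened, the two filtering steps can be merged into one logical index. The helper name keep_long_vowel_words is hypothetical, and the sketch assumes the "more than 5 characters" rule stated in the question.

```r
# Hypothetical helper: both conditions combined in a single logical index
keep_long_vowel_words <- function(x) {
  x[nchar(x) > 5 & grepl("^[AEIOUaeiou]", x)]
}

char_vector <- c("Africa", "identical", "ending", "aa", "bb", "rain", "Friday", "transport")
result <- keep_long_vowel_words(char_vector)
result
```

Here the vectorized & is the right operator, since both sides are logical vectors with one entry per element of x.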
You don't need a for loop, because in R we use vectorized functions.
A simple solution using grep and substr:
substr(grep('^[aeiou].{5}', char_vector, ignore.case = TRUE, value = TRUE), 1, 3)
# [1] "Afr" "ide" "end"
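For readers unfamiliar with grep, the following sketch shows the difference between its default return value (indices) and value = TRUE (the matching strings), and that substr works on a whole vector at once. It uses a simpler pattern with no length condition, purely for illustration; the names idx and vals are not from the answer.

```r
char_vector <- c("Africa", "identical", "ending", "aa", "bb", "rain", "Friday", "transport")

# By default grep returns the positions of the matching elements
idx <- grep("^[aeiou]", char_vector, ignore.case = TRUE)

# With value = TRUE it returns the matching elements themselves
vals <- grep("^[aeiou]", char_vector, ignore.case = TRUE, value = TRUE)

# substr is vectorized: it trims every element in one call
substr(vals, 1, 3)
```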