Return only the unique words
Suppose I have a string and want only the unique words in the sentence, each as a separate element.
a = "an apple is an apple"
word <- function(a){
words<- c(strsplit(a,split = " "))
return(unique(words))
}
word(a)
This returns
[[1]]
[1] "an" "apple" "is" "an" "apple"
My expected output is
'an','apple','is'
What am I doing wrong? Any help is much appreciated.
Cheers!
You can try
unique(unlist(strsplit(a, " ")))
Another possible solution, based on stringr::str_split:
library(tidyverse)
a %>% str_split("\\s+") %>% unlist %>% unique
#> [1] "an" "apple" "is"
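The problem is that wrapping strsplit(.) in c(.) does not change the fact that the result is still a list, so unique operates at the list level, not the word level. With a single sentence the list has one element, so there is nothing to deduplicate. The example below repeats the sentence twice: unique drops the duplicated list element, but the words inside it are left alone.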
c(strsplit(rep(a, 2), "\\s+"))
# [[1]]
# [1] "an" "apple" "is" "an" "apple"
# [[2]]
# [1] "an" "apple" "is" "an" "apple"
unique(c(strsplit(rep(a, 2), "\\s+")))
# [[1]]
# [1] "an" "apple" "is" "an" "apple"
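To make the list-versus-vector point concrete, here is a small sketch in base R. The variable name parts is only for illustration; it checks that c(.) leaves the list untouched and that unlist(.) is what actually produces a character vector.

```r
a <- "an apple is an apple"

# strsplit() returns a list with one element per input string
parts <- strsplit(a, " ")

is.list(parts)               # the result is a list, not a character vector
identical(c(parts), parts)   # c() around a single list gives the same list back
length(unique(parts))        # unique() compares whole list elements

# unlist() flattens the list into a plain character vector of words
flat <- unlist(parts)
is.character(flat)
unique(flat)
```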
Alternatives:
If length(a) is always 1, then perhaps
unique(strsplit(a, "\\s+")[[1]])
# [1] "an" "apple" "is"
If length(a) can be 2 or more and you want a list of unique words for each sentence, then
a2 <- c("an apple is an apple", "a pear is a pear", "an orange is an orange")
lapply(strsplit(a2, "\\s+"), unique)
# [[1]]
# [1] "an" "apple" "is"
# [[2]]
# [1] "a" "pear" "is"
# [[3]]
# [1] "an" "orange" "is"
(Note: this always returns a list, regardless of the number of sentences in the input.)
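As a quick check of that note, the sketch below runs the same lapply call on a single sentence. The name one is hypothetical; the point is that you still need [[1]] to reach the words.

```r
# Same call as above, but with only one sentence as input
one <- lapply(strsplit("an apple is an apple", "\\s+"), unique)

is.list(one)   # still a list, even for a single sentence
length(one)    # one element per input sentence
one[[1]]       # the unique words of that sentence as a character vector
```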
If length(a) can be 2 or more and you want the unique words across all sentences, then
unique(unlist(strsplit(a2, "\\s+")))
# [1] "an" "apple" "is" "a" "pear" "orange"
(Note: this approach also works when length(a) is 1.)
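Putting this together, one way the function from the question might be rewritten is sketched below. The name unique_words is my own, and the trimws() call is an extra assumption on my part: without it, a sentence with leading whitespace would produce an empty string as its first "word".

```r
# Hypothetical helper: unique words across one or more sentences
unique_words <- function(x) {
  # trimws() strips leading/trailing whitespace so no empty token is produced
  tokens <- strsplit(trimws(x), "\\s+")
  # unlist() flattens the per-sentence list so unique() works on words
  unique(unlist(tokens))
}

unique_words("an apple is an apple")
# [1] "an"    "apple" "is"

unique_words(c("an apple is an apple", "a pear is a pear"))
# [1] "an"    "apple" "is"    "a"     "pear"
```

If you need the words per sentence instead, swap the last line of the function for lapply(tokens, unique).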