Using gsub and mapply to remove a vector of words from another vector of words of different lengths
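I have a vector of words that I want to remove from another vector of words. I am using mapply and gsub, but I get the error "longer argument not a multiple of length of shorter".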
sw_column <- c(stop_words$word)
head(sw_column)
[1] "a" "a's" "able" "about" "above" "according"
x <- c(amplification.words, deamplification.words, negation.words)
head(x)
[1] "acute" "acutely" "certain" "certainly" "colossal" "colossally"
stop_words_clean <- mapply(gsub, x, "", sw_column)
error message: longer argument not a multiple of length of shorter
I want to remove all of the words in x from sw_column. Note: not every word in x appears in sw_column.
Just a guess, but setdiff(x, y) returns the elements of "x" (the first argument) that are not in "y" (the second argument). So,
stop_words_clean <- setdiff(sw_column, x)
might be what you want.
Example:
sw_column <- c("a", "a's","able","about", "above","according")
x <- c("a", "able", "above")
setdiff(sw_column, x)
#[1] "a's" "about" "according"
As for gsub, that function modifies the elements of a character vector, which is not your stated objective.
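To make that concrete, here is a minimal sketch of what the mapply/gsub call does when the lengths happen to line up. The vectors `patterns` and `words` are made up for illustration and are not the asker's data. mapply pairs the arguments element by element (recycling the shorter ones), so each pattern is applied to only one word, and gsub edits the text of that word instead of dropping it from the vector.

```r
# Hypothetical stand-ins for x (words to remove) and sw_column (stop words)
patterns <- c("a", "able", "above")
words <- c("a", "a's", "able", "about", "above", "according")

# mapply calls gsub(patterns[i], "", words[i]), recycling patterns as needed.
# The result has the same length as words: nothing is removed.
res <- mapply(gsub, patterns, "", words, USE.NAMES = FALSE)
print(res)
# [1] ""          "a's"       "able"      "bout"      "above"     "according"

# setdiff compares whole elements and drops the matching ones
kept <- setdiff(words, patterns)
print(kept)
# [1] "a's"       "about"     "according"
```

Note that "about" became "bout" in the first result because the pattern "a" matched inside the word, while "able" and "above" survived because they were paired with the wrong patterns.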
If you want to filter one text vector by another, you can use the code below. I used some made-up vectors to illustrate.
stop_words_example <- c("a", "a's", "able", "about", "above", "according")
x <- c("a", "a's", "able", "about", "above", "according", "acute", "acutely", "certain", "certainly", "colossal", "colossally")
x[!x %in% stop_words_example]
[1] "acute" "acutely" "certain" "certainly" "colossal" "colossally"
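In the question's terms the same idiom would presumably be written as sw_column[!sw_column %in% x], since the goal there is to remove the words of x from sw_column. The sketch below uses made-up vectors (`sw` and `remove_these` are hypothetical names) to show one difference from setdiff that may matter: setdiff returns unique values, while %in% indexing keeps repeated entries and the original order.

```r
# Hypothetical stop-word vector that contains repeated entries
sw <- c("a", "able", "able", "about", "above", "about")
# Words to remove; "acute" is not in sw and is simply ignored
remove_these <- c("able", "above", "acute")

# setdiff drops the matches and also collapses duplicates
by_setdiff <- setdiff(sw, remove_these)
print(by_setdiff)
# [1] "a"     "about"

# Logical indexing with %in% drops the matches but keeps duplicates
by_in <- sw[!sw %in% remove_these]
print(by_in)
# [1] "a"     "about" "about"
```

If the stop-word vector has no repeated entries, the two approaches give the same result.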