How to filter out all short strings (2 characters or fewer) in a corpus?

Given a simple string:

t <- "hello world ww ff a wr gj dkjffdkn kuku"

library(tm)
VCorpus(VectorSource(t))

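I want to filter out all substrings of length 2 or shorter. How can I do this with the qdap or tm package? I know I could use a regex for this, but is there a function that does it?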

Using the qdapRegex package, you can:

library(qdapRegex)
rm_nchar_words(t, "1,2")

[1] "hello world dkjffdkn kuku"
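
If you would rather not add a dependency, or you need to run this over the corpus itself instead of the raw string, here is a minimal sketch in base R. Note that `drop_short_words` and its `max_len` argument are names made up for this example, not functions from tm or qdapRegex, and the regex treats a "word" as a run of `\w` characters, which may not match how qdapRegex defines one. The tm part only runs if tm is installed.

```r
# Hypothetical helper (not part of tm or qdapRegex): drop words of
# max_len characters or fewer using a base R regular expression.
drop_short_words <- function(x, max_len = 2) {
  # Match whole words of 1..max_len word characters.
  pattern <- sprintf("\\b\\w{1,%d}\\b", max_len)
  x <- gsub(pattern, "", x, perl = TRUE)   # remove the short words
  x <- gsub("\\s+", " ", x, perl = TRUE)   # collapse leftover whitespace
  trimws(x)                                # strip leading/trailing spaces
}

txt <- "hello world ww ff a wr gj dkjffdkn kuku"
print(drop_short_words(txt))

# Apply the same helper to every document in a tm corpus, if tm is available.
if (requireNamespace("tm", quietly = TRUE)) {
  corpus <- tm::VCorpus(tm::VectorSource(txt))
  corpus <- tm::tm_map(corpus, tm::content_transformer(drop_short_words))
  print(tm::content(corpus[[1]]))
}
```

On the example string this should give the same result as the `rm_nchar_words` call above. Wrapping the helper in `content_transformer` is what lets `tm_map` apply a plain character function to each document while keeping the corpus structure intact.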