Error in x$j : $ operator is invalid for atomic vectors in text clustering

Question

I am doing text clustering. Here is a toy example:

library(tm)
df = data.frame(a=c('aaa', 'bbb', 'ccc'))
corpus = Corpus(VectorSource(df$a))
clean = tm_map(corpus, removeWords, stopwords('english'))
# continue from `clean`, not `corpus`, so the stopword removal is kept
clean = tm_map(clean, stripWhitespace)
dtm = DocumentTermMatrix(clean)
tfidf = weightTfIdf(dtm)
m = as.matrix(tfidf)
rownames(m) = 1:nrow(m)
norm_eucl <- function(m) m/apply(m, MARGIN=1, FUN=function(x) sum(x^2)^.5)
m_norm <- norm_eucl(m)
cl <- kmeans(m_norm, 2)

and this is the error I get:

findFreqTerms(tfidf[cl$cl==1], 2)
Error in x$j : $ operator is invalid for atomic vectors

I tried to fix it with:

findFreqTerms(tfidf[cl['cl']==1], 2)
Error in `[.simple_triplet_matrix`(tfidf, cl["cl"] == 1) : 
  (list) object cannot be coerced to type 'double'

What is wrong? How can I fix this?

Answer 1

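The correct call adds a comma after the row condition, so that the condition selects rows (documents) instead of being treated as a single vector-style index: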

findFreqTerms(tfidf[cl$cl==1,], 2)
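
To illustrate why the comma matters, here is a minimal base R sketch. It uses a plain matrix as a stand-in for the tf-idf matrix and a hand-written list as a stand-in for the `kmeans` result, so the names `m`, `keep` and `cl` here are hypothetical; the assumption is that the document-term matrix is affected by the same single-index versus row-index distinction.

```r
# Base R only: a plain matrix stands in for the tf-idf matrix (assumption).
m <- matrix(1:6, nrow = 3,
            dimnames = list(c("1", "2", "3"), c("aaa", "bbb")))

# Hypothetical stand-in for the logical vector cl$cluster == 1.
keep <- c(TRUE, FALSE, TRUE)

no_comma   <- m[keep]    # single index: picks individual elements, result is a vector
with_comma <- m[keep, ]  # row index plus empty column index: picks whole rows

print(is.matrix(no_comma))    # no longer a matrix
print(is.matrix(with_comma))  # still a matrix
print(dim(with_comma))

# `$` matches list names partially, `[` and `[[` match them exactly.
cl <- list(cluster = c(1L, 2L, 1L))  # hypothetical stand-in for a kmeans result
print(identical(cl$cl, cl$cluster))  # cl$cl resolves to cl$cluster
print(is.null(cl[["cl"]]))           # exact matching finds no element named "cl"
print(is.list(cl["cl"]))             # single bracket returns a list, not a vector
```

The last three lines may also explain the second attempt in the question: `cl['cl']` does not return the cluster vector, so comparing it with `1` fails. Writing `cl$cluster` in full avoids relying on partial matching.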

r

text

cluster-analysis