Error in x$j : $ operator is invalid for atomic vectors in text clustering
I am doing text clustering. Here is a toy example:
library(tm)
df = data.frame(a=c('aaa', 'bbb', 'ccc'))
corpus = Corpus(VectorSource(df$a))
clean = tm_map(corpus, removeWords, stopwords('english'))
# continue from `clean`, not `corpus`, so the stopword removal is kept
clean = tm_map(clean, stripWhitespace)
dtm = DocumentTermMatrix(clean)
tfidf = weightTfIdf(dtm)
m = as.matrix(tfidf)
rownames(m) = 1:nrow(m)
norm_eucl <- function(m) m/apply(m, MARGIN=1, FUN=function(x) sum(x^2)^.5)
m_norm <- norm_eucl(m)
cl <- kmeans(m_norm, 2)
and this is the error:
findFreqTerms(tfidf[cl$cl==1], 2)
Error in x$j : $ operator is invalid for atomic vectors
I tried to fix it like this:
findFreqTerms(tfidf[cl['cl']==1], 2)
Error in `[.simple_triplet_matrix`(tfidf, cl["cl"] == 1) :
(list) object cannot be coerced to type 'double'
What is wrong? How can I fix this?
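The correct answer is to add a comma after the logical index, so that it selects rows (documents) of the matrix instead of individual elements: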
findFreqTerms(tfidf[cl$cl==1,], 2)
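To illustrate why the comma matters and why the second attempt failed, here is a minimal sketch in base R. It is not the tm object itself: a small plain matrix `m` stands in for the tf-idf matrix (an assumption made so the example runs without extra packages), and `in_first` and `rows` are names introduced only for illustration. As far as I can tell, the same indexing rules explain both errors above.

```r
# A plain matrix stands in for the tf-idf matrix (illustrative assumption).
set.seed(1)
m <- matrix(c(1, 0, 0,
              0, 1, 0,
              0, 0, 1), nrow = 3, byrow = TRUE,
            dimnames = list(1:3, c("aaa", "bbb", "ccc")))
cl <- kmeans(m, 2)

# `$` partially matches names, so cl$cl silently resolves to cl$cluster.
print(identical(cl$cl, cl$cluster))

# `[` does not partially match: cl["cl"] is a one-element list holding NULL,
# which cannot be compared with 1 (hence the coercion error in the question).
print(is.list(cl["cl"]))
print(is.null(cl["cl"][[1]]))

# Logical vector marking the documents assigned to cluster 1.
in_first <- cl$cluster == 1

# A single index picks individual elements and returns a plain vector.
print(is.null(dim(m[in_first])))

# With the trailing comma the index selects whole rows, all columns kept.
rows <- m[in_first, , drop = FALSE]
print(dim(rows))
```

Writing `cl$cluster` in full avoids relying on partial matching, so `findFreqTerms(tfidf[cl$cluster == 1, ], 2)` should be an equivalent and slightly more explicit form of the answer.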