Convert from dfm to dtm

Question

I am trying to use the coherence measure calculation reported [here][1].

I work in quanteda, so I have a dfm.

However, the link uses a dtm:

# Create the DTM
dtm <- CreateDtm(tokens$text, 
                 doc_names = tokens$ID, 
                 ngram_window = c(1, 2))
#explore the basic frequency
tf <- TermDocFreq(dtm = dtm)
original_tf <- tf %>% select(term, term_freq,doc_freq)
rownames(original_tf) <- 1:nrow(original_tf)
# Eliminate words appearing less than 2 times or in more than half of the
# documents
vocabulary <- tf$term[ tf$term_freq > 1 & tf$doc_freq < nrow(dtm) / 2 ]
dtm = dtm

How can I use a dfm instead of a dtm in this calculation?

More specifically, how can I create the vocabulary when I have a dfm rather than a dtm?

[1]: https://towardsdatascience.com/beginners-guide-to-lda-topic-modelling-with-r-e57a5a8e7a25

Answer 1

You want convert(). For example:

convert(yourdfm, to = "topicmodels")

or

convert(yourdfm, to = "tm")

See ?convert.
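For the vocabulary step in the question, it may not be necessary to convert first, since the same filter can be computed on the dfm itself. The following is only a minimal sketch on a made-up toy corpus: the texts and the object names (txt, toks, dfm_small) are assumptions for illustration, and it uses unigrams only. It relies on the quanteda functions featfreq(), docfreq(), featnames(), ndoc() and dfm_select(), and converts to a tm DocumentTermMatrix at the end only if the tm package is installed.

```r
library(quanteda)

# Toy corpus (hypothetical data, only for illustration)
txt <- c(d1 = "apple banana apple",
         d2 = "banana cherry",
         d3 = "cherry cherry date",
         d4 = "banana egg",
         d5 = "fig fig grape")

# Build the dfm; for uni- and bigrams as in the question, one could
# apply tokens_ngrams(toks, n = 1:2) before calling dfm()
toks <- tokens(txt)
yourdfm <- dfm(toks)

# Same filter as in the question: keep terms that occur more than once
# overall and appear in fewer than half of the documents
term_freq <- featfreq(yourdfm)  # total count per feature
doc_freq  <- docfreq(yourdfm)   # number of documents containing each feature
vocabulary <- featnames(yourdfm)[term_freq > 1 & doc_freq < ndoc(yourdfm) / 2]

# Restrict the dfm to that vocabulary
dfm_small <- dfm_select(yourdfm, pattern = vocabulary,
                        selection = "keep", valuetype = "fixed")

# Convert only if a downstream package needs a DocumentTermMatrix
if (requireNamespace("tm", quietly = TRUE)) {
  dtm <- convert(dfm_small, to = "tm")
}
```

Note that filtering can leave some documents with no remaining features (d4 in this toy corpus). If the model you fit afterwards cannot handle empty rows, something like dfm_subset(dfm_small, ntoken(dfm_small) > 0) should drop them before converting.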

r

quanteda