Quanteda: Document Feature Matrix with predefined set of features

I am using quanteda to build two document-feature matrices:

library(quanteda)
DFM1 <- dfm("this is a rock")
#        features
# docs    this is a rock
#   text1    1  1 1    1
DFM2 <- dfm("this is music")
#        features
# docs    this is music
#   text1    1  1     1

However, I want DFM2 to have a specific set of features, namely the features from DFM1:

DFM2 <- dfm("this is music", *magicargument* = featnames(DFM1))
#        features
# docs    this is a rock
#   text1    1  1 0    0

Is there a magic argument I am missing? Or is there another efficient way to achieve this for a large bag of words?

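The magic argument is pattern, an argument of dfm_select() rather than dfm(). You can pass it a dfm, and the features of the result will match that dfm's features (including zeros for features that are not in the target dfm):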

dfm_select(DFM2, pattern = DFM1)
# Document-feature matrix of: 1 document, 4 features (50% sparse).
# 1 x 4 sparse Matrix of class "dfmSparse"
#        features
# docs    this is a rock
#   text1    1  1 0    0
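
The output above comes from an older quanteda release (note the "dfmSparse" class). As far as I know, newer releases provide dfm_match() for exactly this purpose and expect a tokens object rather than a raw character vector as input to dfm(). A minimal sketch under those assumptions (the name DFM2_matched is only illustrative, and behavior may differ with your installed version):

```r
library(quanteda)

# Tokenise first, then build each dfm
# (assumption: a recent quanteda version where dfm() takes tokens input)
DFM1 <- dfm(tokens("this is a rock"))
DFM2 <- dfm(tokens("this is music"))

# Conform DFM2 to the feature set of DFM1:
# features absent from DFM1 ("music") are dropped,
# features absent from DFM2 ("a", "rock") are added as zero counts
DFM2_matched <- dfm_match(DFM2, features = featnames(DFM1))

featnames(DFM2_matched)
```

Because the matching works on feature names only, the same call should scale to a large vocabulary: build the reference dfm once and match every new dfm against featnames() of it.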