Quanteda: Document Feature Matrix with predefined set of features
I am building two document-feature matrices with quanteda:
library(quanteda)
DFM1 <- dfm("this is a rock")
#        features
# docs    this is a rock
#   text1    1  1 1    1
DFM2 <- dfm("this is music")
#        features
# docs    this is music
#   text1    1  1     1
However, I want DFM2 to have a specific set of features, namely the ones from DFM1:
DFM2 <- dfm("this is music", *magicargument* = featnames(DFM1))
#        features
# docs    this is a rock
#   text1    1  1 0    0
Is there a magic argument that I am missing? Or is there another efficient way to achieve this for a large bag of words?
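The magic argument is pattern in dfm_select(), where you can supply a dfm whose features will be matched (features that are not in the target dfm are added as zeros):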
dfm_select(DFM2, pattern = DFM1)
# Document-feature matrix of: 1 document, 4 features (50% sparse).
# 1 x 4 sparse Matrix of class "dfmSparse"
#        features
# docs    this is a rock
#   text1    1  1 0    0
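For readers on a more recent quanteda release, here is a minimal sketch of the same idea. It assumes a version in which dfm_match() is available and a dfm is built from tokens() rather than directly from a character vector. The name DFM2_matched is only illustrative and is not part of the original question.

```r
library(quanteda)

# Build both matrices from tokens (assumption: a recent quanteda release,
# where dfm() expects a tokens object rather than raw text)
DFM1 <- dfm(tokens("this is a rock"))
DFM2 <- dfm(tokens("this is music"))

# Conform DFM2 to the feature set and feature order of DFM1:
# features missing from DFM2 are padded with zeros, and features that
# are not in DFM1 (here "music") are dropped
DFM2_matched <- dfm_match(DFM2, features = featnames(DFM1))

featnames(DFM2_matched)
as.matrix(DFM2_matched)
```

Because matching works on a plain character vector of feature names, the same call should also apply when the reference feature set comes from a large training dfm.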