How to classify input text under different categories
Text = "my dog is a rice eater", "I want to buy an a new", "my cat prefers chocolate milk"
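How can I extract keywords from these texts (or from a text corpus) and classify them into different categories (i.e. dog and cat are classified as pets, and rice and chocolate milk are classified as food)?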
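You were downvoted because the question does not give enough detail about what you mean by "classify", and because you did not show the result you are hoping to get.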
Here is a basic answer, though: you can create a dictionary and count the matches against it. In quanteda, it works like this:
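# The three example documents from the question
text <- c("my dog is a rice eater",
          "I want to buy an a new",
          "my cat prefers chocolate milk")
library("quanteda")
# Dictionary mapping each category (key) to its keywords (values)
fooddict <- dictionary(list(pet = c("cat", "dog"),
                            food = c("rice", "chocolate milk")))
# Count dictionary matches per document and category
# (newer quanteda releases deprecate this argument in favor of tokens_lookup())
dfm(text, dictionary = fooddict)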
# Document-feature matrix of: 3 documents, 2 features (33.3% sparse).
# 3 x 2 sparse Matrix of class "dfmSparse"
#        features
# docs    pet food
#   text1   1    1
#   text2   0    0
#   text3   1    1
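If you also want to see which keywords were matched, not only how many, the same dictionary idea can be sketched in base R without any package. This is only a rough illustration, not how quanteda works internally: the names `category_dict`, `match_keywords`, and `count_hits` are made up for this example, matching is done with a simple word-boundary regex, and each keyword is counted at most once per document.

```r
# Base R sketch of dictionary-based categorisation (hypothetical helper names).
texts <- c(text1 = "my dog is a rice eater",
           text2 = "I want to buy an a new",
           text3 = "my cat prefers chocolate milk")

# Plain named list: category -> keywords (assumed to contain no regex metacharacters)
category_dict <- list(pet  = c("cat", "dog"),
                      food = c("rice", "chocolate milk"))

# Return the keywords that occur in `txt` as whole words or phrases.
match_keywords <- function(txt, keywords) {
  found <- vapply(keywords, function(k) {
    grepl(paste0("\\b", k, "\\b"), txt, ignore.case = TRUE, perl = TRUE)
  }, logical(1))
  keywords[found]
}

# Matrix with one row per document and one column per category;
# each cell is the number of distinct dictionary keywords found.
count_hits <- function(texts, dict) {
  sapply(dict, function(keywords) {
    vapply(texts, function(txt) length(match_keywords(txt, keywords)), integer(1))
  })
}

hits <- count_hits(texts, category_dict)

# Nested list: document -> category -> matched keywords
matched <- lapply(texts, function(txt) {
  lapply(category_dict, function(keywords) match_keywords(txt, keywords))
})

print(hits)
print(matched$text3)
```

For anything beyond a toy example, a proper tokenizer (as in quanteda) is the safer choice, since a regex like this does not handle stemming, punctuation variants, or repeated occurrences of the same keyword.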