Multioutput target data is not supported with label binarization
I am fitting my MultinomialNB model on K-fold splits.
I have already tried balancing the data with SMOTE (from the imblearn.over_sampling library).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# stop_words, train_data, k_fold, confusion and scores are assumed to be defined earlier
label_columns = ['SEARCH', 'OPTIONS_VOLUME', 'OPTIONS_QUANTITY', 'OPTIONS_PORTION',
                 'OPTIONS_WEIGHT', 'OPTIONS_SIZE', 'OPTIONS_CONCENTRATION',
                 'OPTIONS_CONTENT', 'OPTIONS_MANUFACTURER']

NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', OneVsRestClassifier(MultinomialNB(
        fit_prior=True, class_prior=None))),
])

for train_indices, test_indices in k_fold.split(train_data):
    train_sequencies = train_data.iloc[train_indices]['NAME']
    label_train = train_data.iloc[train_indices][label_columns]
    test_sequencies = train_data.iloc[test_indices]['NAME']
    label_test = train_data.iloc[test_indices][label_columns]
    # This fit call raises the ValueError quoted in the title
    NB_pipeline.fit(train_sequencies, label_train)
    predictions = NB_pipeline.predict(test_sequencies)
    confusion += confusion_matrix(label_test, predictions)
    score = f1_score(label_test, predictions)
    scores.append(score)
I expect this to run cross-validation for multilabel classification.
OneVsRestClassifier fits one classifier per class, not one per target.
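To make the failure concrete, here is a minimal sketch that should reproduce the error on toy data. The product names and the two integer-coded target columns are invented for illustration; the point is only that a 2-D target with more than two distinct values is treated as multiclass-multioutput, which the label binarizer inside OneVsRestClassifier rejects.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data: six product names and two target columns
names = ["milk 1l", "milk 2l", "soap 100g", "soap 200g", "juice 1l", "juice 2l"]
targets = np.array([[0, 1], [0, 2], [1, 1], [1, 2], [2, 1], [2, 2]])

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", OneVsRestClassifier(MultinomialNB())),
])

# OneVsRestClassifier binarizes the labels first, and that step
# rejects a 2-D multiclass target.
try:
    pipeline.fit(names, targets)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If every target column were binary (0/1), the same array would instead be read as a multilabel indicator matrix and OneVsRestClassifier would accept it.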
Since MultinomialNB doesn't support multioutput target data, you can fit one MultinomialNB per target by using MultiOutputClassifier. This is a simple strategy for extending classifiers that do not natively support multi-target classification.
from sklearn.multioutput import MultiOutputClassifier

NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', MultiOutputClassifier(MultinomialNB(
        fit_prior=True, class_prior=None))),
])
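As a quick check of what this wrapper does, the sketch below fits it on a small invented dataset (the names and integer-coded targets are assumptions, not the asker's data) and inspects the result. It should show one fitted MultinomialNB per target column and predictions with one column per target.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data: six product names and two target columns
names = ["milk 1l", "milk 2l", "soap 100g", "soap 200g", "juice 1l", "juice 2l"]
targets = np.array([[0, 1], [0, 2], [1, 1], [1, 2], [2, 1], [2, 2]])

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(MultinomialNB())),
])
pipeline.fit(names, targets)

# One independent estimator is fitted for each target column
n_fitted = len(pipeline.named_steps["clf"].estimators_)

# Predictions come back as a 2-D array: one row per sample, one column per target
predictions = pipeline.predict(names)
print(n_fitted, predictions.shape)
```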
If you want to fit one classifier per class (each class fitted against all the other classes) and per target:
NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', MultiOutputClassifier(OneVsRestClassifier(MultinomialNB(
        fit_prior=True, class_prior=None)))),
])
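For the cross-validation loop itself, note that f1_score and confusion_matrix do not accept a multiclass-multioutput target either, so one option is to score each target column separately. The following is a sketch under assumptions: the toy data, the 3-fold split, and the choice of macro-averaged F1 are all illustrative and not taken from the question.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import KFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.multioutput import MultiOutputClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data: column 0 = product type, column 1 = size
names = np.array([
    "milk 1l bottle", "milk 2l bottle", "milk 1l carton", "milk 2l carton",
    "soap 100g bar", "soap 200g bar", "soap 100g liquid", "soap 200g liquid",
    "juice 1l bottle", "juice 2l bottle", "juice 1l carton", "juice 2l carton",
])
targets = np.array([
    [0, 0], [0, 1], [0, 0], [0, 1],
    [1, 0], [1, 1], [1, 0], [1, 1],
    [2, 0], [2, 1], [2, 0], [2, 1],
])

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(OneVsRestClassifier(MultinomialNB()))),
])

k_fold = KFold(n_splits=3, shuffle=True, random_state=0)
fold_scores = []
for train_idx, test_idx in k_fold.split(names):
    pipeline.fit(names[train_idx], targets[train_idx])
    predictions = pipeline.predict(names[test_idx])
    # Score every target column on its own, since the metric is single-output
    per_target = [
        f1_score(targets[test_idx][:, j], predictions[:, j],
                 average="macro", zero_division=0)
        for j in range(targets.shape[1])
    ]
    fold_scores.append(per_target)

# Average over folds: one F1 value per target column
mean_f1_per_target = np.mean(fold_scores, axis=0)
print(mean_f1_per_target)
```

The same per-column pattern works for confusion_matrix, keeping one accumulated matrix per target.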