Scikit-learn: zip argument #1 must support iteration
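I have the following pipeline to perform machine learning on a corpus. It first extracts the text, uses TfidfVectorizer
to extract n-grams, and then selects the best features. Without the feature selection step, the pipeline works fine. With it, however, I get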
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/sklearn/pipeline.py", line 90, in __init__
names, estimators = zip(*steps)
TypeError: zip argument #1 must support iteration
at SGDClassifier().
pipeline = Pipeline([
    # Use FeatureUnion to combine the features
    ('features', FeatureUnion(
        transformer_list=[
            # N-GRAMS
            ('ngrams', Pipeline([
                ('extractor', TextExtractor(normalized=True)),  # returns a list of strings
                ('vectorizer', TfidfVectorizer(analyzer='word', strip_accents='ascii', use_idf=True, norm="l2", min_df=3, max_df=0.90)),
                ('feature_selection', SelectPercentile(score_func=chi2, percentile=70)),
            ])),
        ],
    )),
    ('clf', Pipeline([
        SGDClassifier(n_jobs=-1, verbose=0)
    ])),
])
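It looks like you left out the name for a step in your pipeline. Every step must be a (name, estimator) tuple, so
    ('clf', Pipeline([
        SGDClassifier(n_jobs=-1, verbose=0)
    ])),
should be
    ('clf', Pipeline([
        ('sgd', SGDClassifier(n_jobs=-1, verbose=0))
    ])),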
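To see why the missing name produces this particular error, here is a minimal sketch that reproduces the failing line from the traceback (`names, estimators = zip(*steps)`) using only the standard library. `FakeEstimator` and `split_steps` are hypothetical stand-ins invented for illustration, not scikit-learn code; the only part taken from the traceback is the `zip(*steps)` unpacking.

```python
class FakeEstimator(object):
    """Hypothetical stand-in for an estimator such as SGDClassifier."""


def split_steps(steps):
    # Same unpacking as the line shown in the traceback.
    # zip(*steps) passes each step as a separate argument to zip,
    # so every step must itself be iterable, e.g. a (name, estimator) tuple.
    names, estimators = zip(*steps)
    return names, estimators


# A bare estimator is not iterable, so zip raises TypeError.
bad_steps = [FakeEstimator()]
try:
    split_steps(bad_steps)
    bad_error = None
except TypeError as exc:
    bad_error = exc

# A (name, estimator) tuple unpacks cleanly into names and estimators.
good_steps = [('sgd', FakeEstimator())]
names, estimators = split_steps(good_steps)

print(type(bad_error).__name__)
print(names)
```

The exact wording of the TypeError message varies between Python versions, so the sketch only relies on the exception type. As a side note, wrapping a single classifier in its own Pipeline is probably not needed here; passing it directly as `('clf', SGDClassifier(n_jobs=-1, verbose=0))` should behave the same way.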