How to view TF-IDF results?

I am working through this example:

https://www.analyticsvidhya.com/blog/2019/04/predicting-movie-genres-nlp-multi-label-classification/

Specifically, I am at the lines that use TF-IDF:

# create TF-IDF features
xtrain_tfidf = tfidf_vectorizer.fit_transform(xtrain)
xval_tfidf = tfidf_vectorizer.transform(xval)

When I try to look at the contents of xtrain_tfidf, I only get this message:

xtrain_tfidf
Out[69]: 
<33434x10000 sparse matrix of type '<class 'numpy.float64'>'
    with 3494870 stored elements in Compressed Sparse Row format>

I would like to see what xtrain_tfidf actually contains.

How can I view it?

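Jupyter (or rather IPython, or rather the Python REPL) implicitly calls xtrain_tfidf.__repr__() when you evaluate a bare variable name, and for a sparse matrix that is only a one-line summary. print calls xtrain_tfidf.__str__() instead, which is what you want when you need to see the non-zero values of a sparse matrix: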

print(xtrain_tfidf)
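
To make the difference concrete, here is a minimal sketch on a tiny matrix. The matrix m and its values are made up for illustration and merely stand in for xtrain_tfidf; the exact layout of the printed text can vary between SciPy versions.

```python
from scipy.sparse import csr_matrix

# Small hypothetical matrix standing in for xtrain_tfidf.
m = csr_matrix([[0.0, 0.5, 0.0],
                [0.7, 0.0, 0.3]])

summary = repr(m)  # what the REPL shows for a bare variable name
listing = str(m)   # what print(m) shows: one "(row, col) value" entry per stored element

print(summary)
print(listing)

# The same information as explicit (row, col, value) triples.
coo = m.tocoo()
triples = list(zip(coo.row.tolist(), coo.col.tolist(), coo.data.tolist()))
print(triples)
```

Each "(row, col) value" entry reads as: document number row has weight value for the feature in column col.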

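If you want to print everything, including the zeros, try the following. It will be slow and may exhaust memory, since the dense 33434x10000 float64 array alone takes about 2.7 GB: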

import numpy as np

with np.printoptions(threshold=np.inf):
    print(xtrain_tfidf.toarray())