What does tfidfvectorizer.transform() actually produce?

I am new to the tf-idf vectorizer. When I ran my code I got the output below, but I can't work out what it actually means.

**CODE**

import numpy as np

# Assumption: tfidfvectorizer is a TfidfVectorizer that was already fitted on a larger corpus (not shown here)

X = ["Access modes govern the type of operations possible in the opened file. It refers to how the file will be used once its opened. These modes also define the location of the File Handle in the file.", "File handle is like a cursor, which defines from where the data has to be read or written in the file. There are 6 access modes in python."]

X = np.array(X)

ans = tfidfvectorizer.transform(X)  # SciPy sparse matrix, one row per document

print(ans)

**OUTPUT**

  (0, 247682)   0.34757472043242427

  (0, 235525)   0.11981132543319443

  (0, 232967)   0.27278177118815816

  (0, 165607)   0.6769351735727495

  (1, 247953)   0.2657562514567408

  (1, 232967)   0.2589999033874122

  (1, 230813)   0.28434013277955594

  (1, 202607)   0.22380408029504645

Can anyone tell me what (0, 247682) and (1, 247953) mean?

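First, your dataset contains two sentences, so the result has two rows. Every word in these sentences that exists in the vectorizer's fitted vocabulary has a word ID, which is its column index in the result. Words that are not in the vocabulary are ignored.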
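To see where the word IDs come from, here is a minimal sketch on a toy corpus. The names `training_docs`, `vocabulary` and `new_matrix` are made up for illustration, and the toy vectorizer is fitted here only because the fitting step is not shown in the question. With default settings, scikit-learn numbers the terms in sorted order and drops one-character tokens such as "a".

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, standing in for whatever the real vectorizer was fitted on
training_docs = ["the file handle is a cursor", "access modes in python"]
vectorizer = TfidfVectorizer().fit(training_docs)

# vocabulary_ maps each term to its word ID (column index)
vocabulary = dict(sorted(vectorizer.vocabulary_.items(), key=lambda kv: kv[1]))
print(vocabulary)

# transform() reuses the fitted IDs; a word never seen during fit gets no entry
new_matrix = vectorizer.transform(["python file unseenword"])
print(new_matrix.shape)  # (number of documents, size of the vocabulary)
print(new_matrix.nnz)    # number of stored (document, word) scores
```

This is why the IDs in the question are so large: they index into the vocabulary learned at fit time, not into the two sentences being transformed.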

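In (0, 247682):

0 is the document ID (the first sentence), 247682 is the word ID, and 0.34757472043242427 is the TF-IDF score of that word in that document. Likewise, (1, 247953) refers to word 247953 in the second sentence, with a score of 0.2657562514567408.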
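If you want to see which word each ID stands for, you can invert the vectorizer's `vocabulary_` mapping. The sketch below uses a small toy vectorizer fitted on two shortened sentences, since the original fitted vectorizer is not available; `docs`, `index_to_term` and `entries` are illustrative names, and the IDs and scores it prints will differ from those in the question.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Shortened stand-ins for the two sentences in the question
docs = [
    "File handle is like a cursor in the file.",
    "There are 6 access modes in python.",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(docs)  # sparse matrix, one row per document

# vocabulary_ maps term -> word ID; invert it to look up a term by its ID
index_to_term = {index: term for term, index in vectorizer.vocabulary_.items()}

# COO format exposes the same (row, column, value) triples that print() shows
coo = matrix.tocoo()
entries = [
    (int(row), int(col), index_to_term[int(col)], float(score))
    for row, col, score in zip(coo.row, coo.col, coo.data)
]

for doc_id, word_id, term, score in entries:
    print(f"({doc_id}, {word_id})  {term!r}  {score:.4f}")
```

With your own fitted vectorizer, the same inversion of `tfidfvectorizer.vocabulary_` should tell you which words 247682 and 247953 are.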