Why are word embeddings actually vectors?

Question

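Sorry for my naivety, but I don't understand why word embeddings, which are the result of a NN training process (word2vec), are actually vectors.

Embedding is a dimensionality-reduction process: during training, the NN reduces the 1/0 (one-hot) arrays of words to smaller arrays, and the process does nothing that applies vector arithmetic.

So what we get are just arrays, not vectors. Why should I think of these arrays as vectors?

And even if we did get vectors, why does everyone depict them as vectors coming from the origin (0,0)?

Again, I'm sorry if my question looks stupid.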

Answer 1

the process does nothing that applies vector arithmetic

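The training process has nothing to do with vector arithmetic, but once the arrays are produced, they turn out to have quite nice properties, so it makes sense to think of a "word linear space".

For example, which words have embeddings closest to a given word in this space?

In other words, words with similar meanings form a cloud. Here is a 2-D t-SNE representation:

As another example, the distance between "man" and "woman" is very close to the distance between "uncle" and "aunt":

As a result, you get quite reasonable arithmetic: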

W("woman") − W("man") ≃ W("aunt") − W("uncle")
W("woman") − W("man") ≃ W("queen") − W("king")
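The two relations above can be checked numerically. The sketch below uses made-up 3-dimensional vectors (the values and the `offset` and `nearest` helpers are assumptions for illustration, not real word2vec output), so the offsets match exactly; with trained embeddings they would only match approximately.

```python
import numpy as np

# Hypothetical 3-d embeddings, hand-picked for illustration only.
# Axis 0 loosely encodes gender, axis 1 royalty, axis 2 family relation.
W = {
    "man":   np.array([ 1.0, 0.0, 0.0]),
    "woman": np.array([-1.0, 0.0, 0.0]),
    "king":  np.array([ 1.0, 1.0, 0.0]),
    "queen": np.array([-1.0, 1.0, 0.0]),
    "uncle": np.array([ 1.0, 0.0, 1.0]),
    "aunt":  np.array([-1.0, 0.0, 1.0]),
}

def offset(a, b):
    """Return the difference vector W(a) - W(b)."""
    return W[a] - W[b]

def nearest(vec, exclude=()):
    """Return the word whose embedding has the smallest Euclidean distance to vec."""
    candidates = [w for w in W if w not in exclude]
    return min(candidates, key=lambda w: np.linalg.norm(W[w] - vec))

gender_offset = offset("woman", "man")
same_as_aunt_uncle = np.allclose(gender_offset, offset("aunt", "uncle"))
same_as_queen_king = np.allclose(gender_offset, offset("queen", "king"))
print(same_as_aunt_uncle, same_as_queen_king)

# The same relation rearranged: king - man + woman should land near queen.
target = W["king"] - W["man"] + W["woman"]
analogy = nearest(target, exclude=("king", "man", "woman"))
print(analogy)
```

Subtracting and adding the arrays element by element is exactly vector arithmetic, which is why treating them as vectors is natural.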

So calling them vectors is not far-fetched. All pictures are from this wonderful post, which I highly recommend reading.

Answer 2

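Each word is mapped to a point in d-dimensional space (d is usually 300 or 600, though not necessarily), which is why it is called a vector (each point in d-dimensional space is nothing but a vector in that d-dimensional space).

These points have some nice properties (words with similar meanings tend to lie close to each other) [closeness is measured using the cosine distance between 2 word vectors].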
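As a rough illustration of that closeness measure, here is a minimal sketch of cosine distance, taken here as one minus the cosine of the angle between two vectors. The 4-dimensional vectors and the names `cat`, `dog`, and `car` are invented for the example and are not taken from any trained model.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def cosine_distance(u, v):
    """One minus the cosine similarity; smaller means closer in direction."""
    return 1.0 - cosine_similarity(u, v)

# Hypothetical embeddings: "cat" and "dog" point in similar directions,
# "car" points elsewhere.
cat = np.array([0.9, 0.8, 0.1, 0.0])
dog = np.array([0.8, 0.9, 0.2, 0.0])
car = np.array([0.0, 0.1, 0.9, 0.8])

print("cat vs dog:", cosine_distance(cat, dog))
print("cat vs car:", cosine_distance(cat, car))
```

Cosine distance depends only on direction, not on length, so scaling a vector leaves it unchanged. Measuring an angle between two arrays only makes sense if they are read as vectors from the origin, which is one reason embeddings are usually drawn that way.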

Answer 3

What is an embedding?

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Conceptually it involves a mathematical embedding from a space with one dimension per word to a continuous vector space with much lower dimension.

(Source: https://en.wikipedia.org/wiki/Word_embedding)

What is Word2Vec?

Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words.

Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space.

Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.

(Source: https://en.wikipedia.org/wiki/Word2vec)

What is an array?

In computer science, an array data structure, or simply an array, is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key.

An array is stored so that the position of each element can be computed from its index tuple by a mathematical formula.

The simplest type of data structure is a linear array, also called one-dimensional array.

What is a vector / vector space?

A vector space (also called a linear space) is a collection of objects called vectors, which may be added together and multiplied ("scaled") by numbers, called scalars.

Scalars are often taken to be real numbers, but there are also vector spaces with scalar multiplication by complex numbers, rational numbers, or generally any field.

The operations of vector addition and scalar multiplication must satisfy certain requirements, called axioms, listed below.

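(Source: https://en.wikipedia.org/wiki/Vector_space)

What is the difference between a vector and an array?

First of all, a vector in word embeddings is not exactly a programming-language data structure (so this is not a question of Arrays vs Vectors: Introductory Similarities and Differences).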
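To make the definition concrete: arrays of d real numbers, with element-wise addition and multiplication by a scalar, satisfy the vector space axioms. The sketch below spot-checks the axioms on a few arbitrary NumPy arrays; the sample values and the `checks` dictionary are illustrative assumptions, and a check on samples is not a proof (floating-point numbers also only approximate the reals, hence `np.allclose`).

```python
import numpy as np

# Arbitrary sample "embeddings" and scalars, chosen only for illustration.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
w = np.array([-2.0, 0.5, 1.0])
a, b = 2.0, -3.0
zero = np.zeros(3)

checks = {
    "commutativity of addition": np.allclose(u + v, v + u),
    "associativity of addition": np.allclose((u + v) + w, u + (v + w)),
    "additive identity": np.allclose(u + zero, u),
    "additive inverse": np.allclose(u + (-u), zero),
    "compatibility of scalar multiplication": np.allclose(a * (b * u), (a * b) * u),
    "scalar identity": np.allclose(1.0 * u, u),
    "distributivity over vector addition": np.allclose(a * (u + v), a * u + a * v),
    "distributivity over scalar addition": np.allclose((a + b) * u, a * u + b * u),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```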