K-means example (tf.expand_dims)
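In the TensorFlow k-means example code below, tf.expand_dims (which inserts a dimension of size 1 into a tensor's shape) is used to build points_expanded and centroids_expanded before tf.reduce_sum is computed.
Why do the two calls use different indices (0 and 1) as the second argument?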
import numpy as np
import matplotlib.pyplot as plt  # needed for the plotting calls at the end
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.random_shuffle)

points_n = 200
clusters_n = 3
iteration_n = 100

# 200 random 2D points; the initial centroids are 3 randomly chosen points
points = tf.constant(np.random.uniform(0, 10, (points_n, 2)))
centroids = tf.Variable(tf.slice(tf.random_shuffle(points), [0, 0], [clusters_n, -1]))

points_expanded = tf.expand_dims(points, 0)        # shape (1, 200, 2)
centroids_expanded = tf.expand_dims(centroids, 1)  # shape (3, 1, 2)

# Squared distance between every centroid and every point, shape (3, 200)
distances = tf.reduce_sum(tf.square(tf.subtract(points_expanded, centroids_expanded)), 2)
assignments = tf.argmin(distances, 0)  # index of the nearest centroid for each point

# New centroid = mean of the points assigned to each cluster
means = []
for c in range(clusters_n):
    means.append(tf.reduce_mean(tf.gather(points, tf.reshape(tf.where(tf.equal(assignments, c)), [1, -1])), reduction_indices=[1]))
new_centroids = tf.concat(means, 0)
update_centroids = tf.assign(centroids, new_centroids)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for step in range(iteration_n):
        [_, centroid_values, points_values, assignment_values] = sess.run([update_centroids, centroids, points, assignments])
    print("centroids" + "\n", centroid_values)

plt.scatter(points_values[:, 0], points_values[:, 1], c=assignment_values, s=50, alpha=0.5)
plt.plot(centroid_values[:, 0], centroid_values[:, 1], 'kx', markersize=15)
plt.show()
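This is done so that every centroid can be subtracted from every point. First, make sure you understand the concept of broadcasting (https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html), which is linked from the tf.subtract documentation (https://www.tensorflow.org/api_docs/python/tf/subtract). Then you only need to write out the shapes of points, points_expanded, centroids, and centroids_expanded and work out which values get "broadcast" where. Once you do that, you will see that broadcasting lets you compute exactly what you want: the difference between every point and every centroid.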
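The shapes can be checked without TensorFlow. Below is a minimal NumPy sketch, assuming NumPy's broadcasting rules are the ones tf.subtract follows (its documentation links to the NumPy broadcasting page). The data is random and the variable same_axis_ok is only illustrative; the other names mirror the question's code.

```python
import numpy as np

points = np.random.uniform(0, 10, (200, 2))  # 200 points in 2D
centroids = points[:3]                        # 3 centroids, shape (3, 2)

points_expanded = np.expand_dims(points, 0)        # shape (1, 200, 2)
centroids_expanded = np.expand_dims(centroids, 1)  # shape (3, 1, 2)

# Size-1 axes are stretched to match the other operand:
# (1, 200, 2) - (3, 1, 2) -> (3, 200, 2)
diff = points_expanded - centroids_expanded
print(diff.shape)

# diff[c, p] holds point p minus centroid c
assert np.allclose(diff[2, 17], points[17] - centroids[2])

# With the SAME axis on both operands the shapes would be
# (1, 200, 2) and (1, 3, 2): 200 vs 3 cannot be broadcast.
try:
    np.expand_dims(points, 0) - np.expand_dims(centroids, 0)
    same_axis_ok = True
except ValueError:
    same_axis_ok = False
print(same_axis_ok)
```

Putting the new axis in different positions is what makes the point axis and the centroid axis line up against a size-1 axis in the other operand, so every pair gets combined.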
As a sanity check: there are 200 points and 3 centroids, each of them 2D, so we should get 200*3*2 differences. That is exactly what we get:
In [53]: points
Out[53]: <tf.Tensor 'Const:0' shape=(200, 2) dtype=float64>
In [54]: points_expanded
Out[54]: <tf.Tensor 'ExpandDims_4:0' shape=(1, 200, 2) dtype=float64>
In [55]: centroids
Out[55]: <tf.Variable 'Variable:0' shape=(3, 2) dtype=float64_ref>
In [56]: centroids_expanded
Out[56]: <tf.Tensor 'ExpandDims_5:0' shape=(3, 1, 2) dtype=float64>
In [57]: tf.subtract(points_expanded, centroids_expanded)
Out[57]: <tf.Tensor 'Sub_5:0' shape=(3, 200, 2) dtype=float64>
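Following the shapes one step further also accounts for the axis arguments 2 and 0 in the question's code. This is a NumPy sketch, assuming np.sum and np.argmin reduce along an axis the same way tf.reduce_sum and tf.argmin do; the seeded random data is only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0, 10, (200, 2))
centroids = points[:3]  # use the first three points as centroids

# Same broadcast subtraction as in the question, shape (3, 200, 2)
diff = points[np.newaxis, :, :] - centroids[:, np.newaxis, :]

# Summing squares over the coordinate axis (axis 2) leaves one
# squared distance per (centroid, point) pair.
distances = np.sum(np.square(diff), axis=2)  # shape (3, 200)

# Reducing over the centroid axis (axis 0) picks, for each point,
# the index of its nearest centroid.
assignments = np.argmin(distances, axis=0)   # shape (200,)

print(distances.shape, assignments.shape)
```

Because the centroids here are the first three points themselves, each of those points is at distance 0 from its own centroid and is assigned to it.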
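If you have trouble picturing the shapes, you can think of broadcasting points_expanded from shape (1, 200, 2) to shape (3, 200, 2) as copying the 200x2 matrix 3 times along the first dimension. Likewise, the 3x2 matrix in centroids_expanded (shape (3, 1, 2)) is copied 200 times along the second dimension.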
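The "copying" picture can be written out literally. In this NumPy sketch the copies are made explicitly with np.tile and compared with the broadcast result; the names points_tiled, centroids_tiled, explicit, broadcast, and same are hypothetical and not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0, 10, (200, 2))
centroids = points[:3]

points_expanded = points[np.newaxis, :, :]        # shape (1, 200, 2)
centroids_expanded = centroids[:, np.newaxis, :]  # shape (3, 1, 2)

# Copy the 200x2 matrix 3 times along the first dimension
points_tiled = np.tile(points_expanded, (3, 1, 1))          # (3, 200, 2)
# Copy the 3x2 matrix 200 times along the second dimension
centroids_tiled = np.tile(centroids_expanded, (1, 200, 1))  # (3, 200, 2)

explicit = points_tiled - centroids_tiled
broadcast = points_expanded - centroids_expanded

# Both routes produce the same array of differences
same = np.array_equal(explicit, broadcast)
print(same)
```

The explicit copies are only a mental model: in NumPy, broadcasting gives the same values without materializing the tiled arrays.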