Distributed training with LSTM in TensorFlow

Is LSTM an algorithm or a node? If I use it in a model and train that model in a distributed fashion, will backpropagation run into conflicts?

LSTM is neither. It is a recurrent neural network (see this post). In terms of TensorFlow, you might get confused because there's a notion of a cell (e.g., BasicLSTMCell), which is basically a factory for creating cells that form one or several layers. In the end, it all translates to nodes in the computational graph. You can find a good usage example in this notebook. By the way, the training algorithm is the same: backprop.
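To make the "it all translates to nodes" point concrete, here is a minimal sketch of a single LSTM cell in plain Python, not TensorFlow. It assumes a scalar input and a hidden size of 1, and the names `lstm_step`, `run_lstm`, and the weight keys are made up for illustration. Each arithmetic operation below is the kind of thing that becomes a node in the graph, and the same weights are reused at every time step.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell (input size 1, hidden size 1)."""
    # Each gate is an affine function of the input and the previous hidden
    # state, squashed by a nonlinearity.
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # new cell state
    h = o * math.tanh(c)     # new hidden state
    return h, c


def run_lstm(xs, w):
    """Unroll the cell over a sequence, sharing the weights across steps."""
    h, c = 0.0, 0.0
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        states.append(h)
    return states


# Hypothetical weights, chosen only so the example runs.
weight_names = ("wi", "ui", "bi", "wf", "uf", "bf",
                "wo", "uo", "bo", "wg", "ug", "bg")
weights = {name: 0.5 for name in weight_names}
hidden_states = run_lstm([1.0, -1.0, 0.5], weights)
```

Backprop through this is ordinary backprop applied to the unrolled loop, which is why the cell does not need a training algorithm of its own.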

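Now, regarding distributed training, there are two types of parallelism: data parallelism and model parallelism, and neither of them breaks backpropagation. The only exception might be data parallelism with asynchronous updates, which does require certain tricks to work, but there's no first-class support for it in TensorFlow. I think you're better off using a simpler approach to distribute your model (see this post). So the answer is most likely: no, backprop will work fine.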
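As a rough illustration of why synchronous data parallelism leaves backprop intact, the sketch below uses plain Python rather than any TensorFlow API. It assumes a toy linear model with a mean squared error loss, equally sized shards, and hypothetical helper names (`grad_mse`, `data_parallel_grad`). Each "worker" computes the gradient on its own shard, and averaging those gradients gives the same result as one pass over the full batch.

```python
def grad_mse(w, b, batch):
    """Gradient of the mean squared error of y_hat = w * x + b over a batch."""
    n = len(batch)
    gw = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return gw, gb


def data_parallel_grad(w, b, batch, num_workers):
    """Simulate synchronous data parallelism by averaging per-shard gradients."""
    if len(batch) % num_workers != 0:
        # Assumption of this sketch: shards must have equal size, otherwise
        # a plain average of shard gradients is not the full-batch gradient.
        raise ValueError("batch size must be divisible by num_workers")
    size = len(batch) // num_workers
    shards = [batch[k * size:(k + 1) * size] for k in range(num_workers)]
    # Every worker runs ordinary backprop on its own shard.
    grads = [grad_mse(w, b, shard) for shard in shards]
    # The synchronization step: average the gradients across workers.
    gw = sum(g[0] for g in grads) / num_workers
    gb = sum(g[1] for g in grads) / num_workers
    return gw, gb


data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
full = grad_mse(0.0, 0.0, data)
sharded = data_parallel_grad(0.0, 0.0, data, num_workers=2)
```

With asynchronous updates the workers would apply their gradients without waiting for each other, so some gradients are computed against stale weights. That is the case the answer flags as needing extra tricks.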

machine-learning

distributed-computing

backpropagation

lstm

tensorflow