Avoiding local optima when training XOR

I'm using Neataptic to train a neural network that solves XOR with a genetic algorithm. The fitness is defined as follows:

// max score = 0
score -= Math.abs(0 - network.activate([0, 0])) * 5000;
score -= Math.abs(1 - network.activate([1, 0])) * 5000;
score -= Math.abs(1 - network.activate([0, 1])) * 5000;
score -= Math.abs(0 - network.activate([1, 1])) * 5000;

Sometimes it works well, but I assume that is just luck. Most of the time it doesn't even reach -6000; it usually hovers around -8000.
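To put those scores on a scale, here is a minimal sketch that applies the same fitness formula to a few hand-written predictors instead of a real network. The names `CASES`, `absoluteFitness`, `constantHalf`, `logicalOr` and `perfectXor` are made up for illustration, and `predict` is only a stand-in for `network.activate` that returns a plain number.

```javascript
// The four XOR cases, in the same order as the fitness function above.
const CASES = [
  { input: [0, 0], target: 0 },
  { input: [1, 0], target: 1 },
  { input: [0, 1], target: 1 },
  { input: [1, 1], target: 0 },
];

// Same formula as the question: absolute error per case, times 5000.
// `predict` is a hypothetical stand-in for network.activate.
function absoluteFitness(predict) {
  let score = 0;
  for (const { input, target } of CASES) {
    score -= Math.abs(target - predict(input)) * 5000;
  }
  return Math.round(score);
}

// Three reference predictors (assumptions, not trained networks).
const constantHalf = () => 0.5;                    // ignores its input
const logicalOr = ([a, b]) => (a || b ? 1 : 0);    // wrong only on [1, 1]
const perfectXor = ([a, b]) => (a !== b ? 1 : 0);  // the target function

console.log(absoluteFitness(constantHalf)); // -10000
console.log(absoluteFitness(logicalOr));    // -5000
console.log(absoluteFitness(perfectXor));   // 0
```

A score around -8000 sits between the constant-output case and the OR-like case, which is consistent with (but does not prove) the population being stuck on a partial solution.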

These are my settings:

  GNN = new Evolution({
    size: 100,
    elitism: 10,
    mutationRate: 0.3,
    networkSize : [2,3,1],
    mutationMethod: [
      Methods.Mutate.MODIFY_RANDOM_BIAS,
      Methods.Mutate.MODIFY_RANDOM_WEIGHT,
      Methods.Mutate.SWAP_BIAS,
      Methods.Mutate.SWAP_WEIGHT,
      Methods.Mutate.MODIFY_SQUASH
    ],
    crossOverMethod: [
      Methods.Crossover.UNIFORM,
      Methods.Crossover.AVERAGE,
      Methods.Crossover.SINGLE_POINT,
      Methods.Crossover.TWO_POINT
    ],
    selectionMethod: [
      Methods.Selection.FITNESS_PROPORTIONATE
    ],
    generationMethod: [
      Methods.Generation.POINTS
    ],
    fitnessFunction: function(network){
      var score = 0;

      score -= Math.abs(0 - network.activate([0, 0])) * 5000;
      score -= Math.abs(1 - network.activate([1, 0])) * 5000;
      score -= Math.abs(1 - network.activate([0, 1])) * 5000;
      score -= Math.abs(0 - network.activate([1, 1])) * 5000;

      return Math.round(score);
    }
  });

(view the JSFiddle here and press train)

Which settings do you suggest I change? (Please justify your answer.)

P.S. I know that training XOR with backpropagation is much easier, but this is purely for experimentation.

I changed:

fitnessFunction: function(network){
  var score = 0;

  score -= Math.abs(0 - network.activate([0, 0])) * 5000;
  score -= Math.abs(1 - network.activate([1, 0])) * 5000;
  score -= Math.abs(1 - network.activate([0, 1])) * 5000;
  score -= Math.abs(0 - network.activate([1, 1])) * 5000;

  return Math.round(score);
}

to:

fitnessFunction: function(network){
  var score = 0;

  score -= Methods.Cost.MSE([0], network.activate([0, 0])) * 5000;
  score -= Methods.Cost.MSE([1], network.activate([1, 0])) * 5000;
  score -= Methods.Cost.MSE([1], network.activate([0, 1])) * 5000;
  score -= Methods.Cost.MSE([0], network.activate([1, 1])) * 5000;

  return Math.round(score);
}

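This makes sense: squaring the error penalizes completely wrong outputs far more than nearly correct ones, so changes that fix a completely wrong output are more likely to be kept. Read about it here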
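The effect can be illustrated with a small sketch that compares how much score a mutation gains under each error measure. `absoluteError`, `squaredError` and `gain` are hypothetical helpers written for this example; `squaredError` assumes that `Methods.Cost.MSE` over a single output reduces to the squared difference.

```javascript
// Hypothetical stand-ins for the two error measures on a single output.
const absoluteError = (target, output) => Math.abs(target - output);
const squaredError = (target, output) => (target - output) ** 2;

// Score gained when the output for a case with target 0 moves
// from `before` to `after`, using the same 5000 multiplier.
function gain(errorFn, before, after) {
  return Math.round((errorFn(0, before) - errorFn(0, after)) * 5000);
}

// Absolute error: every 0.1 step is worth the same.
console.log(gain(absoluteError, 1.0, 0.9)); // 500
console.log(gain(absoluteError, 0.1, 0.0)); // 500

// Squared error: improving a completely wrong output pays far more
// than polishing an output that is already nearly right.
console.log(gain(squaredError, 1.0, 0.9)); // 950
console.log(gain(squaredError, 0.1, 0.0)); // 50
```

With the squared error, selection pressure concentrates on the cases the network gets badly wrong, which is one plausible reason the second fitness function escapes the plateau more often.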