**Name** Gain

**Intent**

**Motivation**

**Sketch**

<Diagram>

**Discussion**

**Known Uses**

**Related Patterns**

<Diagram>

**References**

https://www.quora.com/Do-all-loss-functions-suffer-from-the-vanishing-gradient-problem-in-neural-networks

http://www.jmlr.org/proceedings/papers/v28/pascanu13.pdf On the difficulty of training recurrent neural networks

http://arxiv.org/abs/1604.02313v3 Norm-preserving Orthogonal Permutation Linear Unit Activation Functions (OPLU)

We propose a novel activation function that implements piece-wise orthogonal non-linear mappings based on permutations. It is straightforward to implement, computationally very efficient, and has low memory requirements. We tested it on two toy problems for feedforward and recurrent networks, where it showed performance similar to tanh and ReLU. The OPLU activation function preserves the norm of the backpropagated gradients, so it is potentially well suited for training deep, extra deep, and recurrent neural networks.
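
A minimal NumPy sketch of the pairwise form of OPLU described above, assuming units are grouped into consecutive pairs and each pair is mapped to (max, min). The function name, the pairing layout, and the example shapes are illustrative assumptions, not the authors' reference implementation. Because each pair is either passed through or swapped, the map is a permutation and therefore preserves the norm of both activations and backpropagated gradients.

<code python>
import numpy as np

def oplu(x):
    """Pairwise OPLU sketch: map each consecutive pair of units to (max, min).

    Each pair is either kept in order or swapped, i.e. the mapping is a
    permutation of the inputs, so the Euclidean norm of the activations
    (and of the gradients flowing back through it) is unchanged.
    Assumes the last dimension of `x` has an even number of units,
    paired as (0,1), (2,3), ...
    """
    a = x[..., 0::2]
    b = x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = np.maximum(a, b)
    out[..., 1::2] = np.minimum(a, b)
    return out

# Example: the per-row norm is unchanged by the activation.
x = np.random.randn(4, 8)
y = oplu(x)
assert np.allclose(np.linalg.norm(x, axis=-1), np.linalg.norm(y, axis=-1))
</code>

In this sketch the norm preservation follows directly from the permutation structure: no unit is scaled or zeroed, unlike ReLU or tanh, which is the property that motivates listing OPLU under this pattern.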