====== Mutable Layer ======

**Aliases** Dynamic Layer

**Intent**

Networks where the activation functions can evolve during training.

**Motivation**

Networks conventionally have fixed activation functions. Networks with simple activation functions require fewer computational resources, while complex activation functions require more resources but have better resolving power. How can we find the right balance in the selection of activation functions?

**Sketch**

//This section provides alternative descriptions of the pattern in the form of an illustration or alternative formal expression. By looking at the sketch a reader may quickly understand the essence of the pattern.//
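
As a rough illustration only (not taken from any particular paper), an activation can be made mutable by blending a cheap linear response with a more expressive non-linearity through a trainable gate. The sketch below assumes PyTorch; the class name ''MutableActivation'' and the choice of tanh are illustrative.

<code python>
import torch
import torch.nn as nn

class MutableActivation(nn.Module):
    """Activation whose shape is learned jointly with the network weights."""
    def __init__(self):
        super().__init__()
        # Raw gate parameter; a sigmoid keeps the mixing weight in (0, 1).
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        g = torch.sigmoid(self.gate)
        # g near 0: nearly linear (cheap); g near 1: fully non-linear (expressive).
        return (1.0 - g) * x + g * torch.tanh(x)

# The gate is trained by ordinary gradient descent along with the layer weights.
layer = nn.Sequential(nn.Linear(16, 16), MutableActivation())
out = layer(torch.randn(4, 16))
</code>
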
**Discussion**

//This is the main section of the pattern that goes into greater detail to explain the pattern. We leverage a vocabulary that we describe in the theory section of this book. We do not go into intense detail in providing proofs, but rather reference the sources of the proofs. How the motivation is addressed is expounded upon in this section. We also include additional questions that may be interesting topics for future research.//

**Known Uses**

//Here we review several projects or papers that have used this pattern.//

**Related Patterns**

//In this section we describe in a diagram how this pattern is conceptually related to other patterns. The relationships may be precise or may be fuzzy, so we provide further explanation of the nature of the relationship. We also describe other patterns that may not be conceptually related but work well in combination with this pattern.//

//Relationship to Canonical Patterns//

//Relationship to other Patterns//

**Further Reading**

//We provide here some additional external material that will help in exploring this pattern in more detail.//

**References**

//To aid in reading, we include sources that are referenced in the text of the pattern.//

http://arxiv.org/pdf/1606.06216v1.pdf
Neural networks with differentiable structure

While gradient descent has proven highly successful in learning connection weights for neural networks, the actual structure of these networks is usually determined by hand, or by other optimization algorithms. Here we describe a simple method to make network structure differentiable, and therefore accessible to gradient descent.

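To picture the general idea of differentiable structure (this is a sketch in the spirit of the abstract, not the paper's exact construction), the effective width of a layer can be exposed to gradient descent by attaching a soft, trainable gate to each hidden unit. ''SoftWidthLayer'' is a hypothetical name and the snippet again assumes PyTorch.

<code python>
import torch
import torch.nn as nn

class SoftWidthLayer(nn.Module):
    """Hidden layer whose effective number of units is learned via soft gates."""
    def __init__(self, d_in, d_max):
        super().__init__()
        self.linear = nn.Linear(d_in, d_max)
        # One gate per hidden unit; gradient descent can shrink or grow the
        # effective width by pushing gates toward 0 or 1.
        self.unit_gates = nn.Parameter(torch.zeros(d_max))

    def forward(self, x):
        g = torch.sigmoid(self.unit_gates)      # each gate lies in (0, 1)
        return torch.relu(self.linear(x)) * g   # gated units fade in or out
</code>
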
https://arxiv.org/pdf/1511.06827.pdf
GradNets: Dynamic Interpolation Between Neural Architectures

Traditionally in deep learning, one makes a static trade-off between the needs of early and late optimization. In this paper, we investigate a novel framework, GradNets, for dynamically adapting architectures during training to get the benefits of both. For example, we can gradually transition from linear to non-linear networks, deterministic to stochastic computation, shallow to deep architectures, or even simple downsampling to fully differentiable attention mechanisms.

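The linear-to-non-linear transition mentioned in the abstract can be pictured as a scheduled interpolation between a simple and a complex component. The snippet below is only a sketch of that idea; the linear ramp, the step budget ''tau'', and the use of ReLU are assumptions, not details from the paper.

<code python>
import torch

def interpolated_activation(x, step, tau=10000.0):
    # g ramps linearly from 0 to 1 over the first tau training steps.
    g = min(step / tau, 1.0)
    # g = 0: purely linear network; g = 1: ordinary ReLU network.
    return (1.0 - g) * x + g * torch.relu(x)
</code>
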
http://arxiv.org/pdf/1606.07326v1.pdf
DropNeuron: Simplifying the Structure of Deep Neural Networks

We proposed regularisers which support a simple mechanism of dropping neurons during a network training process.
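
A common way to realise such a regulariser (a sketch in the spirit of DropNeuron, not its exact formulation) is a group-lasso penalty over each neuron's weights, which drives entire neurons toward zero so they can be pruned after training.

<code python>
import torch

def group_lasso_penalty(weight, eps=1e-8):
    # weight has shape (out_features, in_features); treat each column (all
    # weights leaving one input neuron) as a group and penalise its L2 norm,
    # which pushes whole neurons toward zero.
    return torch.sqrt((weight ** 2).sum(dim=0) + eps).sum()

# Added to the task loss with a coefficient lam during training, e.g.:
#   loss = task_loss + lam * group_lasso_penalty(layer.weight)
</code>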