https://docs.google.com/a/codeaudit.com/document/d/19fs9A7LtOPDVwRL1nn9Y2UjrROBCf4bhir0zC8Iuyu8/edit?usp=sharing

====== Composite Model Patterns ======

This chapter is an extension of the Model Patterns chapter. Here we focus our attention on composite model patterns, that is, characteristic structures that encompass a much broader scope than the patterns in the Model Patterns chapter. We would like to examine the emergent properties that arise in a collection of models.

<del>
A learning machine is trained by fitting a model to observed data. In practice the model can consist of millions of parameters, which means that training will likely overfit the data and therefore lead to poor generalization. Effective training thus requires techniques that improve generalization by avoiding overfitting. Models with good generalization are able to make accurate predictions on data that the machine has never observed during training.

Ideally, one would prefer a machine that can explain how it arrives at a conclusion. Unfortunately, ANNs are black boxes with millions of uninterpretable parameters. Despite this, trained ANNs have been found to exhibit several recurring characteristics. One would hope that a trained model would coalesce into regions that reflect semantics; unfortunately, studies have shown that the resulting models have random-like characteristics. Despite this randomness, researchers have found emergent structures. Curiously enough, models whose parameters appear closer to random are found in machines that generalize very well.

Studying collections of models reveals a kind of duality between abstraction and consensus. A model is able to capture abstract concepts, but isolating those concepts is usually difficult. Models can also be built up from consensus. A model constructed through consensus diffuses its knowledge among many sub-models. Knowledge diffusion is likely fractal in nature, in that models can diffuse knowledge across many layers or across neurons in the same layer. As we explore this area, we shall see that this recursiveness is common in many patterns.

A machine at a high conceptual level has three properties: depth, width and multiplicity. Depth determines the layers of abstraction. Width determines the diffusion of ensembles. Multiplicity contributes to the weighting of ensembles.
</del>
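Although the note above is struck out, the depth/width/multiplicity framing is still a useful mental model. Below is a minimal NumPy sketch (all names are our own illustration, not from any pattern implementation) in which depth is the number of stacked layers, width is the number of units per layer, and multiplicity is the number of ensemble members whose predictions are combined under fixed weights.

<code python>
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def make_model(depth, width, n_in, n_out):
    """Stack `depth` random linear layers of `width` units (toy model)."""
    sizes = [n_in] + [width] * depth + [n_out]
    return [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(model, x):
    for W in model[:-1]:
        x = relu(x @ W)        # depth: successive layers of abstraction
    return x @ model[-1]

# multiplicity: several models whose outputs are combined with weights
multiplicity = 5
ensemble = [make_model(depth=3, width=16, n_in=8, n_out=1) for _ in range(multiplicity)]
weights = np.full(multiplicity, 1.0 / multiplicity)   # uniform consensus weighting

x = rng.normal(size=(4, 8))                           # a small batch of inputs
prediction = sum(w * forward(m, x) for w, m in zip(weights, ensemble))
print(prediction.shape)   # (4, 1)
</code>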

One key question for deciding whether a pattern belongs in the Composite Model category is: "Does the pattern affect behavior in both the training and inference stages?" An additional criterion is that Composite Models consist only of constructs that come from the Model pattern category.
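
As an illustration of a construct that is active in both stages, consider tied weights (the [[Weight Sharing]] pattern listed below): the shared matrix shapes the computation during training and is the very same matrix used at inference. A minimal NumPy sketch of a tied-weight autoencoder, with all names hypothetical:

<code python>
import numpy as np

rng = np.random.default_rng(1)

# One shared matrix: the encoder uses W, the decoder reuses W.T (tied weights).
n_in, n_hidden = 8, 3
W = rng.normal(scale=0.1, size=(n_in, n_hidden))

def autoencode(x, W):
    h = np.tanh(x @ W)        # encode with W
    return h @ W.T            # decode with the transpose of the same W

x = rng.normal(size=(4, n_in))
x_hat = autoencode(x, W)

# Any gradient step on the reconstruction error updates the single shared W,
# and inference runs that same W: the construct acts in both stages.
print(np.mean((x - x_hat) ** 2))
</code>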

{{http://main-alluviate.rhcloud.com/wp-content/uploads/2016/06/composite.png}}

[[Implicit Ensemble]]

[[Weight Sharing]] (Tied Weights), related to Implicit Ensemble

[[Layer Sharing]]

[[Weight Quantization]]

[[Layer Reversibility]] (Reversible Layer)

[[Network in Network]]

[[Residual]]

[[Ladder]]

[[Cardinality]]

[[Variational Autoencoder]]

[[Deep Neural Decision Tree]] (ALVT-200)

[[Convolution Recurrent Network]]

[[Correlational Network]]

[[Transition Based]]

[[Hourglass]]

[[Prior Knowledge]]

[[Iterative Inference]]

**References**

Neural Module Networks — http://arxiv.org/pdf/1511.02799v3.pdf

"In this paper, we have introduced neural module networks, which provide a general-purpose framework for learning collections of neural modules which can be dynamically assembled into arbitrary deep networks."
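
To make dynamic assembly concrete, here is a minimal sketch (our own illustration, not the paper's code) in which small reusable modules are composed into a different network per input from a layout; the module names are hypothetical stand-ins for the paper's attention and classification modules.

<code python>
import numpy as np

rng = np.random.default_rng(2)

# A small inventory of reusable modules with shared parameters.
dim = 8
modules = {
    "find":    lambda x, W=rng.normal(scale=0.1, size=(dim, dim)): np.tanh(x @ W),
    "combine": lambda x, W=rng.normal(scale=0.1, size=(dim, dim)): np.maximum(x @ W, 0.0),
    "answer":  lambda x, W=rng.normal(scale=0.1, size=(dim, 2)):  x @ W,
}

def assemble(layout):
    """Chain the named modules into one network for this particular input."""
    def network(x):
        for name in layout:
            x = modules[name](x)
        return x
    return network

# Different inputs yield different layouts, hence different networks,
# but the module parameters are shared across all assemblies.
net_a = assemble(["find", "answer"])
net_b = assemble(["find", "combine", "answer"])

x = rng.normal(size=(1, dim))
print(net_a(x).shape, net_b(x).shape)   # (1, 2) (1, 2)
</code>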