  
In this paper, we study the dimensionality of the representations learned by models that have proved highly successful for image classification. We focus on ResNet-18, ResNet-50 and VGG-19 and observe that when trained on the CIFAR10 or CIFAR100 datasets, the learned representations exhibit a fairly low-rank structure. We propose a modification to the training procedure, which further encourages low-rank representations of activations at various stages in the neural network. Empirically, we show that this has implications for compression and robustness to adversarial examples.
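The abstract does not spell out the proposed training modification here. As a rough illustration only, one standard way to encourage low-rank activations is to add a nuclear-norm penalty (a convex surrogate for matrix rank) on a layer's activation matrix; the helper name and the weight `lam` below are hypothetical, not the authors' method.

```python
import torch

def nuclear_norm_penalty(activations: torch.Tensor) -> torch.Tensor:
    """Sum of singular values of the (batch x features) activation matrix.

    The nuclear norm is a convex surrogate for rank, so adding this term
    to the loss nudges the representation toward low-rank structure.
    """
    mat = activations.flatten(start_dim=1)  # (batch, C, H, W) -> (batch, C*H*W)
    return torch.linalg.matrix_norm(mat, ord="nuc")

# Hypothetical use in a training step:
#   loss = cross_entropy(logits, targets) + lam * nuclear_norm_penalty(hidden)
```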
https://papers.nips.cc/paper/3904-guaranteed-rank-minimization-via-singular-value-projection.pdf Guaranteed Rank Minimization via Singular Value Projection

https://arxiv.org/abs/1805.04582v1 TensOrMachine: Probabilistic Boolean Tensor Decomposition

https://arxiv.org/abs/0909.4061 Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions
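The core of the randomized approach in the paper above fits in a few lines. A minimal NumPy sketch of a Halko-et-al.-style randomized SVD (function name and defaults are illustrative):

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD via a random range finder.

    Stage A: sample the range of A with a Gaussian test matrix.
    Stage B: project A onto that basis and take a small exact SVD.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    Omega = rng.standard_normal((n, k + oversample))  # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ Omega)     # orthonormal basis for the sampled range
    B = Q.T @ A                        # small (k + oversample) x n matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k, :]  # lift U back to full space
```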
https://arxiv.org/abs/1802.05983v2 Disentangling by Factorising

https://arxiv.org/pdf/1810.10531.pdf A mathematical theory of semantic development in deep neural networks

The synaptic weights of the neural network extract from the statistical structure of the environment a set of paired object analyzers and feature synthesizers associated with every categorical distinction. The bootstrapped, simultaneous learning of each pair solves the apparent Gordian knot of knowing both which items belong to a category, and which features are important for that category: the object analyzers determine category membership, while the feature synthesizers determine feature importance, and the set of extracted categories are uniquely determined by the statistics of the environment.
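In the paper's linear-network analysis, each analyzer/synthesizer pair arises from the singular value decomposition of the input-output correlation matrix. A toy NumPy illustration (the items, features, and numbers below are made up for illustration, not taken from the paper):

```python
import numpy as np

# Toy items-by-features table: rows = items, columns = binary features.
items = ["canary", "robin", "salmon", "oak"]
features = ["can fly", "can swim", "has leaves", "is alive"]
Y = np.array([
    [1, 0, 0, 1],   # canary
    [1, 0, 0, 1],   # robin
    [0, 1, 0, 1],   # salmon
    [0, 0, 1, 1],   # oak
], dtype=float)

# With one-hot item inputs, the input-output correlation matrix is Y.T / N.
Sigma_yx = Y.T / len(items)

# Columns of U act as feature synthesizers, rows of Vt as object analyzers;
# each singular mode encodes one categorical distinction (e.g. plant/animal).
U, S, Vt = np.linalg.svd(Sigma_yx)
for a, strength in enumerate(S):
    print(f"mode {a}: strength {strength:.2f}, item loadings {np.round(Vt[a], 2)}")
```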