INCREMENTAL LEARNING

We proposed a brain-inspired framework capable of incrementally learning data with different modalities and object classes. FearNet outperforms existing methods for incremental class learning on large image and audio classification benchmarks, demonstrating that FearNet is capable of recalling and consolidating recently learned information while also retaining old information.

https://www.technologyreview.com/s/609710/neural-networks-are-learning-what-to-remember-and-what-to-forget/

https://arxiv.org/abs/1711.09601v1 Memory Aware Synapses: Learning what (not) to forget

Inspired by neuroplasticity, we propose an online method to compute the importance of the parameters of a neural network, based on the data that the network is actively applied to, in an unsupervised manner. After learning a task, whenever a sample is fed to the network, we accumulate an importance measure for each parameter of the network, based on how sensitive the predicted output is to a change in this parameter. When learning a new task, changes to important parameters are penalized. We show that a local version of our method is a direct application of Hebb's rule in identifying the important connections between neurons.

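A minimal PyTorch-style sketch of that idea: accumulate per-parameter importance from the sensitivity of the output norm, then penalize changes to important parameters. Importance is accumulated per batch rather than per sample for brevity, and the model/data-loader interface and the weight lam are illustrative assumptions, not the authors' code:

<code python>
import torch

def accumulate_importance(model, data_loader, device="cpu"):
    """Accumulate per-parameter importance from the sensitivity of the
    squared L2 norm of the network output -- no labels required."""
    importance = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_batches = 0
    for x, _ in data_loader:          # labels are ignored (unsupervised)
        model.zero_grad()
        model(x.to(device)).pow(2).sum().backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                importance[n] += p.grad.abs()
        n_batches += 1
    return {n: imp / max(n_batches, 1) for n, imp in importance.items()}

def mas_penalty(model, importance, old_params, lam=1.0):
    """Quadratic penalty on moving important parameters away from the values
    they had after the previous task; add this to the new task's loss."""
    penalty = 0.0
    for n, p in model.named_parameters():
        penalty = penalty + (importance[n] * (p - old_params[n]) ** 2).sum()
    return lam * penalty
</code>

During training on the next task the total loss would be task_loss + mas_penalty(model, importance, old_params), with old_params a frozen copy of the parameters saved right after the previous task.
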
https://arxiv.org/pdf/1801.01423v1.pdf Overcoming Catastrophic Forgetting with Hard Attention to the Task

https://openreview.net/forum?id=rkfOvGbCW Memory-based Parameter Adaptation

Our method, Memory-based Parameter Adaptation, stores examples in memory and then uses a context-based lookup to directly modify the weights of a neural network. Much higher learning rates can be used for this local adaptation, negating the need for many iterations over similar data before good predictions can be made. Because our method is memory-based, it alleviates several shortcomings of neural networks, such as catastrophic forgetting, and supports fast, stable acquisition of new knowledge, learning with imbalanced class labels, and fast learning during evaluation. We demonstrate this on a range of supervised tasks: large-scale image classification and language modelling.

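A rough sketch of the store / look up / locally adapt loop described above, deliberately simplified (all retrieved neighbours are weighted equally); embed_net (the embedding network), head (the output network) and the step that fills the memory during training are assumed here, not taken from the paper's code:

<code python>
import copy
import torch
import torch.nn.functional as F

class EpisodicMemory:
    """Stores (embedding key, label) pairs and retrieves nearest neighbours."""
    def __init__(self):
        self.keys, self.labels = [], []

    def write(self, keys, labels):
        self.keys.append(keys.detach())
        self.labels.append(labels.detach())

    def lookup(self, query, k=32):
        keys = torch.cat(self.keys)
        labels = torch.cat(self.labels)
        k = min(k, keys.size(0))
        idx = torch.cdist(query, keys).squeeze(0).topk(k, largest=False).indices
        return keys[idx], labels[idx]

def adapt_and_predict(embed_net, head, memory, x, steps=5, lr=0.1, k=32):
    """Context-based lookup followed by local adaptation: fine-tune a temporary
    copy of the output head on the retrieved neighbours, predict, discard it."""
    query = embed_net(x)                      # x is a single query batch
    keys, labels = memory.lookup(query, k=k)
    local_head = copy.deepcopy(head)
    opt = torch.optim.SGD(local_head.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(local_head(keys), labels).backward()
        opt.step()
    return local_head(query)
</code>

Because the adaptation happens on a throwaway copy of the output head, the base network's weights are never overwritten, which is what makes the high local learning rate safe.
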
https://arxiv.org/abs/1711.05769 PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning

https://arxiv.org/abs/1707.01429v1 Theory of the superposition principle for randomized connectionist representations in neural networks

https://arxiv.org/abs/1712.07136 Low-Shot Learning with Imprinted Weights

The final layer weights are set directly from novel training examples during low-shot learning. This process is called weight imprinting because it directly sets the weights for a new category based on an appropriately scaled copy of the embedding layer activations for that training example.

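A small sketch of the imprinting step itself, assuming an embedding backbone embed_net and a cosine-similarity classifier whose weight matrix classifier_weight has one row per class; the mean embedding is used here when several examples of the new class are available:

<code python>
import torch
import torch.nn.functional as F

def imprint_weights(classifier_weight, embed_net, novel_examples, new_class_idx):
    """Set the final-layer weight row of a new class to the L2-normalized
    (mean) embedding of its few training examples."""
    with torch.no_grad():
        emb = embed_net(novel_examples)             # (n_shots, embed_dim)
        emb = F.normalize(emb, dim=1).mean(dim=0)   # average the unit embeddings
        classifier_weight[new_class_idx] = F.normalize(emb, dim=0)
    return classifier_weight
</code>

Classification for the imprinted class then reduces to cosine similarity between a query's normalized embedding and this weight row, with optional fine-tuning afterwards.
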
https://github.com/facebookresearch/GradientEpisodicMemory Gradient Episodic Memory for Continual Learning

https://github.com/jaehong-yoon93/DEN Lifelong Learning with Dynamically Expandable Networks

https://arxiv.org/pdf/1808.06508.pdf Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies

https://deepmind.com/blog/imagine-creating-new-visual-concepts-recombining-familiar-ones/

We propose the Variational Autoencoder with Shared Embeddings (VASE). Based on the Minimum Description Length principle, VASE automatically detects shifts in the data distribution and allocates spare representational capacity to new knowledge, while simultaneously protecting previously learnt representations from catastrophic forgetting. Our approach encourages the learnt representations to be disentangled, which imparts a number of desirable properties: VASE can deal sensibly with ambiguous inputs, it can enhance its own representations through imagination-based exploration, and most importantly, it exhibits semantically meaningful sharing of latents between different datasets. Compared to baselines with entangled representations, our approach is able to reason beyond surface-level statistics and perform semantically meaningful cross-domain inference.

https://arxiv.org/abs/1705.09847 Lifelong Generative Modeling

In this work we focus on a lifelong learning approach to generative modeling where we continuously incorporate newly observed distributions into our learnt model. We do so through a student-teacher Variational Autoencoder architecture which allows us to learn and preserve all the distributions seen so far without the need to retain the past data or the past models. Through the introduction of a novel cross-model regularizer, inspired by a Bayesian update rule, the student model leverages the information learnt by the teacher, which acts as a summary of everything seen till now. The regularizer has the additional benefit of reducing the effect of catastrophic interference that appears when we learn over sequences of distributions. We demonstrate its efficacy in learning sequentially observed distributions as well as its ability to learn a common latent representation across a complex transfer learning scenario.

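One way such a cross-model regularizer could look, sketched here as a posterior-matching KL term on samples replayed from the frozen teacher; elbo_loss, sample and encode are a placeholder VAE interface, and the exact form of the paper's regularizer may differ:

<code python>
import torch

def student_teacher_loss(student_vae, teacher_vae, new_batch, n_replay=64, beta=1.0):
    """ELBO on newly observed data plus a cross-model term that keeps the
    student consistent with a frozen teacher on the teacher's own samples."""
    loss_new = student_vae.elbo_loss(new_batch)   # standard VAE loss on new data

    with torch.no_grad():                         # the teacher summarizes the past
        replayed = teacher_vae.sample(n_replay)
        mu_t, logvar_t = teacher_vae.encode(replayed)

    # Cross-model regularizer: KL between the student's and the teacher's
    # diagonal-Gaussian posteriors on the replayed samples.
    mu_s, logvar_s = student_vae.encode(replayed)
    kl = 0.5 * (logvar_t - logvar_s
                + (logvar_s.exp() + (mu_s - mu_t) ** 2) / logvar_t.exp() - 1)
    return loss_new + beta * kl.sum(dim=1).mean()
</code>

Because the teacher's samples stand in for all previously seen data, no past data or past models beyond the single teacher need to be retained.
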
https://arxiv.org/abs/1804.00218v1 Synthesis of Differentiable Functional Programs for Lifelong Learning

https://arxiv.org/abs/1809.02058v1 Memory Replay GANs: learning to generate images from new categories without forgetting

We study two methods to prevent forgetting by leveraging these replays, namely joint training with replay and replay alignment. Qualitative and quantitative experimental results on the MNIST, SVHN and LSUN datasets show that our memory replay approach can generate competitive images while significantly mitigating the forgetting of previous categories.

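A rough sketch of the joint-training-with-replay variant: before learning a new category, the previous generator is frozen and its samples of old categories are mixed into every training batch. The conditional-generator call signature and old_classes (a tensor of previously seen labels) are assumptions for illustration:

<code python>
import torch

def build_replay_batch(generator_prev, real_x, real_y, old_classes,
                       n_replay, z_dim, device="cpu"):
    """Extend a batch of real images of the new category with images of old
    categories sampled from a frozen copy of the previous generator."""
    with torch.no_grad():
        z = torch.randn(n_replay, z_dim, device=device)
        idx = torch.randint(len(old_classes), (n_replay,), device=device)
        y_old = old_classes.to(device)[idx]
        replay_x = generator_prev(z, y_old)   # assumed interface: G(z, labels)
    return torch.cat([real_x, replay_x]), torch.cat([real_y, y_old])
</code>

Generator and discriminator are then trained on the mixed batch as usual; the replay-alignment variant instead adds an alignment (distillation) loss between the current and the frozen generator for the same latent code and label.
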
https://openreview.net/forum?id=H1lIzhC9FX Learning to remember: Dynamic Generative Memory for Continual Learning

https://openreview.net/forum?id=rJgz8sA5F7 HC-Net: Memory-based Incremental Dual-Network System for Continual learning

https://openreview.net/forum?id=BkloRs0qK7 A comprehensive, application-oriented study of catastrophic forgetting in DNNs

https://openreview.net/forum?id=ryGvcoA5YX Overcoming Catastrophic Forgetting via Model Adaptation

http://proceedings.mlr.press/v80/miconi18a.html Differentiable plasticity: training plastic neural networks with backpropagation