This shows you the differences between two versions of the page.

associative_memory [2017/09/25 18:06] external edit
associative_memory [2018/05/30 21:58]
Line 35: Line 35:
 in Generative Models
 +https://arxiv.org/abs/1610.08613v2 Can Active Memory Replace Attention?
 +We propose an extended model of active memory that matches existing attention models on neural machine translation and generalizes better to longer sentences. We investigate this model and explain why previous active memory models did not succeed. Finally, we discuss when active memory brings most benefits and where attention can be a better choice.
 +https://arxiv.org/pdf/1804.01756.pdf The Kanerva Machine: A Generative Distributed Memory
 +We present an end-to-end trained memory system that quickly adapts to new data and generates samples like them. Inspired by Kanerva's sparse distributed memory, it has a robust distributed reading and writing mechanism. The memory is analytically tractable, which enables optimal on-line compression via a Bayesian update-rule. We formulate it as a hierarchical conditional generative model, where memory provides a rich data-dependent prior distribution. Consequently, the top-down memory and bottom-up perception are combined to produce the code representing an observation. Empirically, we demonstrate that the adaptive memory significantly improves generative models trained on both the Omniglot and CIFAR datasets. Compared with the Differentiable Neural Computer (DNC) and its variants, our memory model has greater capacity and is significantly easier to train.
 +http://www.rctn.org/vs265/kanerva09-hyperdimensional.pdf
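The "distributed reading and writing" mentioned in the Kanerva Machine abstract can be illustrated with a minimal sketch of classic sparse distributed memory, the model that abstract cites as its inspiration. This is not the Kanerva Machine itself (there is no Bayesian update and no generative model here). The class name, helper names, and all parameter values (256-bit words, 2000 hard locations, activation radius 112) are assumptions chosen for illustration only.

```python
import random


def hamming(a, b):
    """Number of differing bits between two non-negative ints."""
    return bin(a ^ b).count("1")


def flip_bits(word, n_flips, n_bits, rng):
    """Return a copy of `word` with `n_flips` distinct bit positions inverted."""
    for pos in rng.sample(range(n_bits), n_flips):
        word ^= 1 << pos
    return word


class SparseDistributedMemory:
    """Toy sparse distributed memory (illustrative, hypothetical parameters)."""

    def __init__(self, n_bits=256, n_locations=2000, radius=112, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Fixed random "hard location" addresses, stored as n_bits-wide ints.
        self.addresses = [rng.getrandbits(n_bits) for _ in range(n_locations)]
        # One signed counter per bit for every hard location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        """Indices of hard locations within `radius` Hamming distance."""
        return [i for i, a in enumerate(self.addresses)
                if hamming(a, address) <= self.radius]

    def write(self, address, word):
        """Distributed write: add the word (as +1/-1 per bit) to every active location."""
        for i in self._active(address):
            row = self.counters[i]
            for b in range(self.n_bits):
                row[b] += 1 if (word >> b) & 1 else -1

    def read(self, address):
        """Distributed read: sum counters over active locations, threshold at zero."""
        sums = [0] * self.n_bits
        for i in self._active(address):
            row = self.counters[i]
            for b in range(self.n_bits):
                sums[b] += row[b]
        word = 0
        for b, s in enumerate(sums):
            if s > 0:
                word |= 1 << b
        return word


if __name__ == "__main__":
    rng = random.Random(1)
    memory = SparseDistributedMemory()
    patterns = [rng.getrandbits(memory.n_bits) for _ in range(5)]
    for p in patterns:
        memory.write(p, p)  # autoassociative storage: address == content
    cue = flip_bits(patterns[0], 20, memory.n_bits, rng)
    recalled = memory.read(cue)
    print("cue distance to stored pattern:     ", hamming(cue, patterns[0]))
    print("recalled distance to stored pattern:", hamming(recalled, patterns[0]))
```

Because each word is spread over many locations and each read pools many locations, a corrupted cue still overlaps with enough of the locations written by the original pattern to recover it. That robustness is the property the sketch is meant to show.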
 +https://github.com/jgpavez/Working-Memory-Networks The Working Memory Network is a Memory Network architecture with a novel working memory storage and relational reasoning module. The model retains the relational reasoning abilities of the Relation Network while reducing its computational complexity considerably. The model achieves state-of-the-art performance in the jointly trained bAbI-10k dataset, with an average error of less than 0.5%.