Recurrent Ladder Networks

We propose a recurrent extension of the Ladder network [24], which is motivated by the inference required in hierarchical latent variable models. We demonstrate that the recurrent Ladder is able to handle a wide variety of complex learning tasks that benefit from iterative inference and temporal modeling.

LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation

In summary, the free-energy formulation dispenses with value functions and prescribes optimal trajectories in terms of prior expectations. Active inference ensures these trajectories are followed, even under random perturbations. In what sense are priors optimal? They are optimal in the sense that they restrict the states of an agent to a small part of state-space. In this formulation, rewards do not attract trajectories; rewards are just sensory states that are visited frequently. If we want to change the behaviour of an agent in a social or experimental setting, we simply induce new (empirical) priors by exposing the agent to a new environment. From the engineering perspective, the ensuing behaviour is remarkably robust to noise and limited only by the specification of the new (controlled) environment.

A Minimal Active Inference Agent

Deep temporal models and active inference

Active inference and learning

IDK Cascades: Fast Deep Learning by Learning not to Overthink
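
As a rough illustration of the cascade idea summarized in the abstract that follows, the sketch below chains a cheap model and an expensive one. A stage answers only when its confidence clears a threshold; otherwise it says "I don't know" and defers to the next stage. The stage format, thresholds, costs, and toy models are all assumptions made for this example, not the paper's construction or its search methods.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage format: (predict, threshold, cost).
# `predict` maps an input to (label, confidence in [0, 1]).
Predict = Callable[[Sequence[float]], Tuple[str, float]]
Stage = Tuple[Predict, float, float]


def cascade_predict(stages: List[Stage], x: Sequence[float]) -> Tuple[str, float]:
    """Run stages in order and return (label, total_cost_spent)."""
    total_cost = 0.0
    for i, (predict, threshold, cost) in enumerate(stages):
        total_cost += cost
        label, confidence = predict(x)
        is_last = i == len(stages) - 1
        # A stage that is not confident enough answers "I don't know",
        # which here means deferring to the next stage. The last stage
        # always answers.
        if is_last or confidence >= threshold:
            return label, total_cost
    raise ValueError("cascade needs at least one stage")


def cheap_model(x: Sequence[float]) -> Tuple[str, float]:
    # Toy model: confident only when the feature is far from 0.5.
    label = "cat" if x[0] < 0.5 else "dog"
    return label, abs(x[0] - 0.5) * 2


def expensive_model(x: Sequence[float]) -> Tuple[str, float]:
    # Toy model: slightly different decision boundary, always confident.
    label = "cat" if x[0] < 0.45 else "dog"
    return label, 1.0


stages = [(cheap_model, 0.6, 1.0), (expensive_model, 0.0, 10.0)]
easy = cascade_predict(stages, [0.1])    # answered by the cheap stage
hard = cascade_predict(stages, [0.48])   # deferred to the expensive stage
```

With these toy settings, the easy input is answered at a cost of 1.0, while the ambiguous input pays for both stages. The saving comes from how often the cheap stage is confident enough to answer.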

We introduce the “I Don't Know” (IDK) prediction cascades framework, a general framework for composing a set of pre-trained models to accelerate inference without a loss in prediction accuracy. We propose two search-based methods for constructing cascades as well as a new cost-aware objective within this framework. We evaluate these techniques on a range of both benchmark and real-world datasets and demonstrate that prediction cascades can reduce computation by 37%, resulting in up to 1.6x speedups in image classification tasks over state-of-the-art models without a loss in accuracy.

Dynamic Evaluation of Neural Sequence Models
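
A minimal sketch of the idea in the abstract that follows, assuming a one-parameter next-step predictor and plain SGD as the adaptation rule. Both are illustrative choices, not the paper's model or update rule. Each step is scored with the current parameters first and only then used to adapt them, so the model never sees a target before predicting it.

```python
from typing import Sequence, Tuple


def dynamic_eval(sequence: Sequence[float], theta_g: float, lr: float) -> Tuple[float, float]:
    """Evaluate a toy predictor x[t+1] ~ theta * x[t] while adapting theta.

    theta_g is the parameter learned at training time; lr = 0 gives
    ordinary static evaluation. Returns (mean squared error, final theta).
    """
    theta = theta_g
    total = 0.0
    steps = 0
    for prev, target in zip(sequence, sequence[1:]):
        err = theta * prev - target
        total += err * err              # score with the current parameters
        theta -= lr * 2.0 * err * prev  # then take one SGD step on this example
        steps += 1
    return total / steps, theta


# Toy sequence whose true rule is x[t+1] = -x[t]; the "trained" theta_g = 0 is wrong.
seq = [1.0, -1.0] * 5
static_loss, _ = dynamic_eval(seq, theta_g=0.0, lr=0.0)
dynamic_loss, adapted_theta = dynamic_eval(seq, theta_g=0.0, lr=0.1)
```

On this toy sequence the adapted parameter drifts toward -1, so the dynamically evaluated loss ends up below the static one.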

Dynamic evaluation methods continuously adapt the model parameters θ_g, learned at training time, to parts of a sequence during evaluation.

Recurrent Inference Machines for Solving Inverse Problems
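
To make the "inference unrolled as a recurrent network" picture in the abstract that follows more concrete, here is a small sketch for a scalar inverse problem y = a * x. At each step a recurrent cell receives the likelihood gradient, the current estimate, and its own state, and emits an additive update. The cell below is a hand-set momentum rule used only as a stand-in; in the setting the abstract describes, it would be an RNN trained with back-propagation through time. All function names and constants are assumptions.

```python
from typing import Callable, Tuple

Cell = Callable[[float, float, float], Tuple[float, float]]


def likelihood_grad(x: float, y: float, a: float) -> float:
    """Gradient w.r.t. x of the Gaussian log-likelihood -0.5 * (y - a*x)**2."""
    return a * (y - a * x)


def toy_cell(grad: float, x: float, state: float) -> Tuple[float, float]:
    """Hypothetical stand-in for a trained RNN cell: returns (update, new_state).

    This is a fixed momentum rule, NOT a learned network; it ignores x.
    """
    new_state = 0.5 * state + grad
    return 0.2 * new_state, new_state


def unrolled_inference(y: float, a: float, cell: Cell,
                       steps: int = 50, x0: float = 0.0) -> float:
    """Iteratively refine the estimate x with the recurrent cell."""
    x, state = x0, 0.0
    for _ in range(steps):
        grad = likelihood_grad(x, y, a)
        delta, state = cell(grad, x, state)
        x = x + delta  # additive update of the current estimate
    return x


x_hat = unrolled_inference(y=6.0, a=2.0, cell=toy_cell)  # true solution is x = 3
```

Swapping `toy_cell` for a trained network leaves the loop unchanged, which is the sense in which the inference procedure itself becomes something that is learned.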

Much of the recent research on solving iterative inference problems focuses on moving away from hand-chosen inference algorithms and towards learned inference. In the latter, the inference process is unrolled in time and interpreted as a recurrent neural network (RNN), which allows for joint learning of model and inference parameters with back-propagation through time. In this framework, the RNN architecture is directly derived from a hand-chosen inference algorithm, effectively limiting its capabilities. We propose a learning framework, called Recurrent Inference Machines (RIM), in which we turn algorithm construction the other way round: given data and a task, train an RNN to learn an inference algorithm. Because RNNs are Turing complete [1, 2], they are capable of implementing any inference algorithm. The framework allows for an abstraction which removes the need for domain knowledge. We demonstrate in several image restoration experiments that this abstraction is effective, allowing us to achieve state-of-the-art performance on image denoising and super-resolution tasks and superior across-task generalization.

Deep Predictive Coding Network for Object Recognition

PCN reuses a single architecture to recursively run bottom-up and top-down processes, enabling an increasingly longer cascade of non-linear transformations. For image classification, PCN refines its representation over time towards more accurate and definitive recognition.

Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement

Iterative Visual Reasoning Beyond Convolutions

The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module. Our graph module has three components: a) a knowledge graph where we represent classes as nodes and build edges to encode different types of semantic relationships between them; b) a region graph of the current image where regions in the image are nodes and spatial relationships between these regions are edges; c) an assignment graph that assigns regions to classes. Both the local module and the global module roll out iteratively and cross-feed predictions to each other to refine estimates. The final predictions are made by combining the best of both modules with an attention mechanism. We show strong performance over plain ConvNets, e.g., achieving an 8.4% absolute improvement on ADE measured by per-class average precision. Analysis also shows that the framework is resilient to missing regions for reasoning.

Meta-learning with differentiable closed-form solvers
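
As a hedged illustration of what a closed-form solver used for fast adaptation can look like, the sketch below fits ridge regression on a tiny "support set" in one step and classifies query points with the result. Ridge regression is chosen here only because it has a one-line solution, w = (XᵀX + λI)⁻¹ Xᵀy; the abstract that follows names logistic regression as its example. The fixed 2-D feature vectors stand in for embeddings that a deep network would produce, and every name is an assumption. Since the solve is made of ordinary arithmetic, an autodiff framework could back-propagate through the same steps.

```python
from typing import List, Sequence


def ridge_fit_2d(xs: Sequence[Sequence[float]], ys: Sequence[float],
                 lam: float = 0.1) -> List[float]:
    """Closed-form ridge regression for 2-D inputs: w = (X^T X + lam*I)^-1 X^T y."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x1, x2), y in zip(xs, ys):
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    a11 += lam  # add lam to the diagonal of X^T X
    a22 += lam
    det = a11 * a22 - a12 * a12  # positive whenever lam > 0
    # Apply the explicit inverse of the symmetric 2x2 matrix to b.
    w1 = (a22 * b1 - a12 * b2) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return [w1, w2]


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear score; its sign gives the predicted class (+1 or -1)."""
    return w[0] * x[0] + w[1] * x[1]


# Hypothetical few-shot task: two examples per class, labels +1 / -1.
support_x = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
support_y = [1.0, 1.0, -1.0, -1.0]
w = ridge_fit_2d(support_x, support_y)

query_pos = score(w, (0.8, 0.2))  # expected to be positive
query_neg = score(w, (0.2, 0.8))  # expected to be negative
```

Adapting to a new task here means calling `ridge_fit_2d` once on that task's support set, with no iterative training loop.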

In this work we propose to use these fast convergent methods as the main adaptation mechanism for few-shot learning. The main idea is to teach a deep network to use standard machine learning tools, such as logistic regression, as part of its own internal model, enabling it to quickly adapt to novel tasks. This requires back-propagating errors through the solver steps.

Iterative Amortized Inference

Recurrent Inference Machines for Solving Inverse Problems

We establish this framework by abandoning the traditional separation between model and inference. Instead, we propose to learn both components jointly without the need to define their explicit functional form. This paradigm shift enables us to bridge the gap between the fields of deep learning and inverse problems. A crucial and unique quality of RIMs is their ability to generalize across tasks without the need to retrain. We demonstrate this feature in our experiments, along with state-of-the-art results on image denoising and super-resolution.

Concept Learning with Energy-Based Models

Generating Multi-Agent Trajectories using Programmatic Weak Supervision

We blend deep generative models with programmatic weak supervision to generate coordinated multi-agent trajectories of significantly higher quality than previous baselines.