http://www.cs.toronto.edu/~fritz/absps/transauto6.pdf Transforming Auto-encoders

https://www.youtube.com/watch?v=TFIMqt0yT2I Geoffrey Hinton: “Does the Brain do Inverse Graphics?”

http://cseweb.ucsd.edu/~gary/cs200/s12/Hinton.pdf Does the Brain do Inverse Graphics?

https://arxiv.org/abs/1503.03167 Deep Convolutional Inverse Graphics Network

http://willwhitney.com/dc-ign/www/ https://github.com/willwhitney/dc-ign

https://www.cs.toronto.edu/~hinton/csc2535/notes/lec6b.pdf Taking Inverse Graphics Seriously

https://arxiv.org/pdf/1406.6901.pdf

https://www.vicarious.com/img/icml2017-schemas.pdf Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics

In pursuit of efficient and robust generalization, we introduce the Schema Network, an object-oriented generative physics simulator capable of disentangling multiple causes of events and reasoning backward through causes to achieve goals. The richly structured architecture of the Schema Network can learn the dynamics of an environment directly from data. We compare Schema Networks with Asynchronous Advantage Actor-Critic and Progressive Networks on a suite of Breakout variations, reporting results on training efficiency and zero-shot generalization, consistently demonstrating faster, more robust learning and better transfer.

https://arxiv.org/abs/1801.05091v1 Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis

We propose a novel hierarchical approach for text-to-image synthesis by inferring semantic layout. Instead of learning a direct mapping from text to image, our algorithm decomposes the generation process into multiple steps: it first constructs a semantic layout from the text with a layout generator, then converts the layout to an image with an image generator. The layout generator progressively constructs a semantic layout in a coarse-to-fine manner by generating object bounding boxes and refining each box by estimating the object shape inside it. The image generator synthesizes an image conditioned on the inferred semantic layout, which provides a useful semantic structure matching the text description. Our model not only generates semantically more meaningful images, but also allows automatic annotation of generated images and a user-controlled generation process via modification of the generated scene layout. We demonstrate the capability of the proposed model on the challenging MS-COCO dataset and show that it substantially improves image quality, interpretability of outputs, and semantic alignment to the input text over existing approaches.
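The coarse-to-fine pipeline described above (text → bounding boxes → object shapes → image) can be sketched as three chained stages. This is only an illustrative numpy mock-up under assumed shapes, not the paper's model: the random linear map, the box rasterizer, and the coloring step are hypothetical stand-ins for the learned box generator, shape generator, and image generator.

```python
import numpy as np

rng = np.random.default_rng(0)

def box_generator(text_emb, n_objects=3):
    """Coarse step: map a text embedding to object bounding boxes
    (x, y, w, h in (0, 1)). The random linear map is a stand-in for
    the paper's learned box generator."""
    W = rng.standard_normal((text_emb.size, n_objects * 4))
    boxes = 1.0 / (1.0 + np.exp(-(text_emb @ W)))  # sigmoid keeps coords in (0, 1)
    return boxes.reshape(n_objects, 4)

def shape_generator(boxes, size=64):
    """Fine step: rasterize each box into a binary object mask,
    standing in for the learned shape refinement inside each box."""
    layout = np.zeros((boxes.shape[0], size, size))
    for k, (x, y, w, h) in enumerate(boxes):
        x0, y0 = int(x * size), int(y * size)
        x1 = min(size, x0 + max(1, int(w * size)))
        y1 = min(size, y0 + max(1, int(h * size)))
        layout[k, y0:y1, x0:x1] = 1.0
    return layout

def image_generator(layout):
    """Final step: synthesize an image conditioned on the semantic
    layout; here just a per-object coloring of the mask channels."""
    colors = rng.random((layout.shape[0], 3))
    return np.clip(np.einsum('khw,kc->hwc', layout, colors), 0.0, 1.0)

text_emb = rng.standard_normal(128)  # placeholder text encoding
img = image_generator(shape_generator(box_generator(text_emb)))
print(img.shape)  # (64, 64, 3)
```

The point of the decomposition is that the intermediate layout is inspectable and editable: a user can move a box before the image stage runs, which is exactly the user-controlled generation the abstract mentions.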

https://arxiv.org/abs/1801.09597v1 Deep Reinforcement Learning using Capsules in Advanced Game Environments

This thesis introduces the use of CapsNet for Q-Learning based game algorithms. To successfully apply CapsNet to advanced game play, three main contributions follow. First, the introduction of four new game environments as frameworks for RL research with increasing complexity, namely Flash RL, Deep Line Wars, Deep RTS, and Deep Maze. These environments fill the gap between the relatively simple and the more complex game environments available for RL research, and are used in the thesis to test and explore CapsNet behavior. Second, the thesis introduces a generative modeling approach to produce artificial training data for use in Deep Learning models, including CapsNets. We empirically show that conditional generative modeling can successfully generate game data of sufficient quality to train a Deep Q-Network well. Third, we show that CapsNet is a reliable architecture for Deep Q-Learning based algorithms for game AI. A capsule is a group of neurons that determines the presence of objects in the data, and has been shown in the literature to increase the robustness of training and predictions while lowering the amount of training data needed. It should, therefore, be ideally suited for game playing.
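The "group of neurons whose vector length encodes presence" idea behind every CapsNet entry on this page rests on the squash nonlinearity from Sabour et al. (2017). A minimal numpy sketch:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash nonlinearity (Sabour et al., 2017): shrinks short vectors
    toward zero and long vectors toward unit length, so a capsule's
    length can be read as the probability that the entity it detects
    is present, while its orientation encodes the entity's pose."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

# A capsule output is just a vector of activations.
v = squash(np.array([3.0, 4.0]))  # input has length 5
print(np.linalg.norm(v))          # close to, but strictly below, 1
```

Because the length saturates below 1, a downstream Q-network (as in the thesis above) can treat capsule lengths directly as object-presence probabilities.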

https://arxiv.org/pdf/1805.03551v2.pdf A Unified Framework of Deep Neural Networks by Capsules

This capsule framework could not only simplify the description of existing DNNs, but also provide a theoretical basis for the graphical design and programming of new deep learning models. As future work, we will try to define an industrial standard and implement a graphical platform for the advancement of deep learning with capsule networks, with a similar extension to recurrent neural networks.

https://arxiv.org/pdf/1804.10172.pdf Capsule networks for low-data transfer learning

The generative capsule network uses what we call a memo architecture, which consists of convolving the images into the Digit Capsules, applying convolutional reconstruction, and classifying images based on the reconstruction.

https://arxiv.org/abs/1805.08090v1 Graph Capsule Convolutional Neural Networks

https://arxiv.org/abs/1805.07242 Siamese Capsule Networks

https://github.com/yash-1995-2006/Conditional-and-nonConditional-Capsule-GANs/ Conditional-and-nonConditional-Capsule-GANs

https://github.com/XifengGuo/CapsNet-Keras

https://arxiv.org/abs/1810.05315v1 A Context-aware Capsule Network for Multi-label Classification

We introduce, (1) a novel routing weight initialization technique, (2) an improved CapsNet design that exploits semantic relationships between the primary capsule activations using a densely connected Conditional Random Field and (3) a Cholesky transformation based correlation module to learn a general priority scheme. Our proposed design allows CapsNet to scale better to more complex problems, such as the multi-label classification task, where semantically related categories co-exist with various interdependencies.
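Contribution (1) above modifies the initialization of the routing logits in routing-by-agreement. For reference, the baseline procedure it modifies (Sabour et al., 2017) can be sketched as follows; the zero initialization of `b` is the part the paper replaces, and this sketch shows only that baseline, not the paper's scheme:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule nonlinearity: shrinks short vectors toward zero and
    long vectors toward unit length."""
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Baseline routing-by-agreement.
    u_hat: predictions from lower capsules, shape (n_lower, n_upper, dim).
    Returns upper-capsule outputs v and coupling coefficients c."""
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))  # routing logits; the zero init is what the paper replaces
    for _ in range(n_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # softmax over upper capsules
        v = squash(np.einsum('ij,ijd->jd', c, u_hat))         # weighted-sum predictions, squashed
        b = b + np.einsum('ijd,jd->ij', u_hat, v)             # raise logits where prediction agrees
    return v, c

rng = np.random.default_rng(0)
v, c = dynamic_routing(rng.standard_normal((8, 4, 16)))
print(np.allclose(c.sum(axis=1), 1.0))  # True: each lower capsule's couplings sum to 1
```

Because the logits start identical, every lower capsule initially routes uniformly regardless of content; a learned, context-aware initialization lets routing start from an informed prior instead, which is the motivation the abstract gives.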

https://arxiv.org/abs/1811.06969v1 DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules