When we are first introduced to deep learning, we tend to see it as a better machine learning classifier. Alternatively, we subscribe to the hype that it is 'brain-like' neuro-computing. In the former case, we grossly underestimate the kinds of applications we can build with it. In the latter case, we grossly overestimate its capabilities and, as a consequence, overlook applications that fall short of general artificial intelligence but are more realistic and pragmatic.

It is best to look at applications of deep learning from the perspective of improving human-computer interaction. This is perhaps the most natural approach. Deep learning systems do appear to have capabilities that approximate those of biological brains. As such, they are most effectively used to augment tasks that humans or even animals have been employed to perform. It is important to remember that deep learning systems are very different from traditional symbolic computing platforms. Just as humans think very differently from how a computer computes, deep learning systems work very differently too.

However, deep learning systems are built from traditional computational technology, so the tireless mechanistic efficiency inherent in computers is also present in deep learning. Computers are much more capable than humans at accurate symbolic computation and inference. Deep learning systems are not yet capable of performing complex symbolic computation themselves, but they are by default already linked to this capability.

Applications built using deep learning seem to be straight out of science fiction. Here is a partial sample of some of the incredible applications that have been developed so far:

Photo Captioning for the Blind

Facebook has developed a mobile app that is able to describe a photograph to people who are blind.

Realtime Speech Translation

Microsoft Skype is able to translate voice into different languages in realtime, something straight out of the universal translator in Star Trek.

Automated Email Replies

Google Mail is able to automatically respond to email on your behalf.

Object Identification

Moodstocks (acquired by Google) is able to identify common objects using your mobile phone.

Location Identification from Photographs

Google is able to identify the location where a photograph was taken just by analyzing the scene.

Organizing Collections of Photographs

Google Photos is able to automatically organize your photographs into collections with shared themes.

Classifying Photographs

Yelp is able to automatically classify photographs into different business-relevant categories.

Self-Driving Cars

A hobbyist is able to teach his car to self-drive in a few hours.

From the paper End to End Learning for Self-Driving Cars:

We have empirically demonstrated that CNNs are able to learn the entire task of lane and road following without manual decomposition into road or lane marking detection, semantic abstraction, path planning, and control. A small amount of training data from less than a hundred hours of driving was sufficient to train the car to operate in diverse conditions, on highways, local and residential roads in sunny, cloudy, and rainy conditions. The CNN is able to learn meaningful road features from a very sparse training signal (steering alone). The system learns for example to detect the outline of a road without the need of explicit labels during training.

Music Composition

Music can be composed based on different composer styles.

Painting Based on Artists' Styles

Paintings can be created in the styles of famous artists.

Discovery of New Materials

New materials are discovered with the help of deep learning.

Playing Video Games

Google DeepMind is able to create video game playing systems that learn how to play well by just watching the game.

Playing Championship Level Go

Google DeepMind has created a Go playing system that is able to learn new strategies by playing against itself.

Face Identification

Face recognition is so common that it is no longer surprising.

Click-bait Headline Generation

An RNN is trained to generate click-bait headlines.

Colorization of Black and White Photographs

A system is trained to convert black and white photographs into color. There is also a service that lets you try this out on your own photos!

Realtime Translation of Images of Text

Google has a mobile app that translates the text found in a photo into text that you can understand.

Predictive Keyboards

SwiftKey is building keyboards for mobile phones that make it easier and faster for you to type.

Predict the Future ;-)

Well, that's the claim by these folks at MIT.

3D Object Classification

Gesture Recognition

Learning the meaning of different hand gestures is likely going to be how we interact with devices that don't have screens.

Deep Learning for Electromyographic Hand Gesture Signal Classification by Leveraging Transfer Learning

Converting Photos of People to Make Them Smile

SmileVector is able to take an image of a person and transform it into an image of the person smiling.

Human Like Conversation

Google has created a messaging application that has more natural conversational capabilities.

Augmented Reality - Face Tracking

Baidu created a mobile app that is able to track faces using Deep Learning. The app overlays a 3D image over one's face.

Warehouse Optimization

A Deep Learning system is trained to learn an optimal way of picking and placing items in a warehouse. This system is faster than the more traditional operations research optimization approach.

Sketch to Search

Sketch an image as a query to a visual search.

Prostheses Control

EEG-informed attended speaker extraction from recorded speech mixtures with application in neuro-steered hearing prostheses.

Accelerating Fluid Simulation

Leveraging convolutional networks to create fast and highly realistic fluid simulations.


Amazon drives its personalization capabilities using Deep Learning.

Brain Tumor Detection

Results reported on the 2013 BRATS test dataset reveal that the 802,368 parameter network improves over published state-of-the-art and is over 30 times faster.

Reducing your Electric Bill

Google is using technology from the DeepMind artificial intelligence subsidiary for big savings on the power consumed by its data centers.

Stocking Shelves

Amazon-sponsored researchers used deep learning to analyze 3D scans of objects that their robot had to pick and replace.

Mapping Streets

Facebook is using Deep Learning to create more accurate and current maps from satellite imagery.

Voice Printing

Identifying people through their voice.

Infrared Colorization

Users may more quickly and accurately comprehend infrared images that have been colorized.

3D Design

Takes a 3D voxel representation of a shape and a semantic deformation intention (e.g., make more sporty) as input, and generates a deformation flow as output.

Sketch to Generate Realistic Photos

Synthesize photorealistic face images from face sketches.

Predicting Clinical Events

An RNN is trained on time-stamped EHR data from 260 thousand patients and 14,805 physicians over 8 years. The network is able to make multilabel predictions (one label for each diagnosis or medication category). The system can perform differential diagnosis with up to 79% recall, significantly higher than several baselines.
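To make the recall figure more concrete, here is a minimal sketch of how recall could be scored for multilabel predictions. It assumes a "recall at k" style of evaluation, where the network assigns a score to every label and we check how many of the true labels land in the top k. The text above does not say which recall variant was used, so treat that as an assumption. The function names and the diagnosis labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence, Set


def recall_at_k(scores: Dict[str, float], true_labels: Set[str], k: int) -> float:
    """Fraction of the true labels that appear among the k highest-scoring labels."""
    if not true_labels:
        raise ValueError("true_labels must not be empty")
    # Rank label names from highest to lowest predicted score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    top_k = set(ranked[:k])
    return len(top_k & true_labels) / len(true_labels)


def mean_recall_at_k(
    batch_scores: Sequence[Dict[str, float]],
    batch_true: Sequence[Set[str]],
    k: int,
) -> float:
    """Average recall at k over several visits (one score dict per visit)."""
    recalls = [recall_at_k(s, t, k) for s, t in zip(batch_scores, batch_true)]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical scores for one patient visit (not real model output).
    visit_scores = {"hypertension": 0.9, "diabetes": 0.7, "asthma": 0.2, "anemia": 0.1}
    actual = {"hypertension", "asthma"}
    print(recall_at_k(visit_scores, actual, k=2))  # 0.5: only one of two true labels is in the top 2
```

A larger k makes the metric more forgiving, which is why a recall number is only meaningful alongside the k (or threshold) it was measured at.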

Skin Evaluation and Recommendation

Using Deep Learning to determine a customer’s “skin age,” identify problem areas and offer a regimen of products meant to address those issues.


Drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining.


Reducing Risk in Agriculture due to Climate Change

Mapping Poverty using Satellite Data

Discover New Compression Algorithms

Full Resolution Image Compression with Recurrent Neural Networks

This is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.

Writing a Pop Song

Transfiguring Portraits

Place your face into another portrait.

Speech Synthesis

Blur Out Background in Photographs

Predicting Corporate Bankruptcies

YouTube Recommendations

Sorting Cucumbers

Reducing Traffic

Reverse Engineering Biological Processes

Realtime Facial Transfer

Research at Stanford shows how you can transfer your expressions onto someone else's face in realtime. This is not a deep learning application, but I would not be surprised if a deep learning system could do something similar. Not realtime, but using Deep Learning:

Fast Face-swap Using Convolutional Neural Networks

Swap Nicolas Cage and Taylor Swift onto another person's face.

Virtual Assistant

Analysis of Disaster Damage

Realtime Conversational Assistance

Detect Fashionable Clothing

Baby Sleep Monitor

Voice Conversion

Voice Conversion using Convolutional Neural Networks

Music Genre Classification

Photorealistic Facial Texture Inference

Music Classification


Image Editing

Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US

Story Points (Task Estimation)

Other Vision Applications

Scene Text Erase

Visual Product Discovery

Spatial-Temporal Recurrent Neural Network for Emotion Recognition

Facial Animation

Crowdturfing

Automated Crowdturfing Attacks and Defenses in Online Review Systems

Watermark Removal

The Conditional Analogy GAN: Swapping Fashion Articles on People Images


See Behind Walls

Neural network identification of people hidden from view with a single-pixel, single-photon detector

Chemical Synthesis

Learning to Plan Chemical Syntheses

Smart Mirror Makeup

Reading Text in the Wild

The Automation of the “Technical” Part of Art: The Use of Artificial Intelligence in the Artistic Creation

Sketch Interpretation and Refinement Using Statistical Models

NIMA: Neural Image Assessment

DeepHeart: Semi-Supervised Sequence Learning for Cardiovascular Risk Prediction

End-to-End Differentiable Learning of Protein Structure

Voice Cloning

Logo Detection

Caricature Drawing

Lung cancer

Assistive Creativity

Generative Creativity

FontCode: Embedding Information in Text Documents using Glyph Perturbation