Applications

When we are first introduced to deep learning, we tend to see it as just a better machine learning classifier. Alternatively, we may subscribe to the hype that it is 'brain-like' neuro-computing. In the former case, we grossly underestimate the kinds of applications we can build with it. In the latter case, we grossly overestimate its capabilities and, as a consequence, overlook applications that fall short of general artificial intelligence but are more realistic and pragmatic.

It is best to look at applications of deep learning from the perspective of improving human-computer interaction. This is perhaps the most natural approach. Deep learning systems do appear to have capabilities that approximate those of biological brains. As such, they can be most effectively used to augment tasks that humans or even animals have been employed to perform. It is important to remember that deep learning systems are very different from traditional symbolic computing platforms. Just as humans think very differently from how a computer computes, deep learning systems also work differently.

However, deep learning systems are intrinsically built from traditional computational technology, so the tireless mechanistic efficiency inherent in computers is also present in deep learning. Computers are much more capable than humans at performing accurate symbolic computation and inference. Deep learning systems are not yet capable of performing complex symbolic computation themselves. They are, however, already linked to this capability by default.

Applications built using deep learning seem to be straight out of science fiction. Here is a partial sample of some of the incredible applications that have been developed so far:

Photo Captioning for the Blind

Facebook has developed a mobile app that is able to describe a photograph to people who are blind. http://www.wired.com/2015/10/facebook-artificial-intelligence-describes-photo-captions-for-blind-people/

Realtime Speech Translation

Microsoft Skype is able to translate voice into different languages in realtime, much like the universal translator in Star Trek. http://blogs.skype.com/2014/12/15/skype-translator-how-it-works/

Automated Email Replies

Google Mail is able to automatically respond to email on your behalf. http://www.wired.com/2015/11/google-is-using-ai-to-create-automatic-replies-in-gmail/

Object Identification

Moodstocks (acquired by Google) is able to identify common objects using your mobile phone. http://www.slideshare.net/CdricDeltheil1/moodstocks-mobile-image-recognition-paris-tech-talks-6

Location Identification from Photographs

Google is able to identify the location where a photograph was taken just by analyzing the scene. https://www.technologyreview.com/s/600889/google-unveils-neural-network-with-superhuman-ability-to-determine-the-location-of-almost/

Organizing Collections of Photographs

Google Photos is able to automatically organize your photographs into collections with shared themes. https://www.youtube.com/watch?v=JuFtW1PSYAU

Classifying Photographs

Yelp is able to automatically classify photographs into different business relevant categories. http://engineeringblog.yelp.com/2015/10/how-we-use-deep-learning-to-classify-business-photos-at-yelp.html

Self-Driving Cars

A hobbyist is able to teach his car to self-drive in a few hours. http://www.bloomberg.com/features/2015-george-hotz-self-driving-car/

https://arxiv.org/pdf/1604.07316v1.pdf End to End Learning for Self-Driving Cars

We have empirically demonstrated that CNNs are able to learn the entire task of lane and road following without manual decomposition into road or lane marking detection, semantic abstraction, path planning, and control. A small amount of training data from less than a hundred hours of driving was sufficient to train the car to operate in diverse conditions, on highways, local and residential roads in sunny, cloudy, and rainy conditions. The CNN is able to learn meaningful road features from a very sparse training signal (steering alone). The system learns for example to detect the outline of a road without the need of explicit labels during training.
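The idea of learning steering directly from pixels, with the recorded steering command as the only label, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the one-row synthetic "camera frame", the `make_frame` and `train` helpers, the steering rule, and above all the use of a plain linear model trained by stochastic gradient descent in place of the CNN described in the paper.

```python
import random

WIDTH = 8  # pixels in the toy one-row "camera frame" (hypothetical)

def make_frame(offset):
    """Synthetic frame: a bright lane marker whose position shifts with the
    car's lateral offset from the lane centre. No lane labels are produced."""
    center = (WIDTH - 1) / 2 + offset
    return [max(0.0, 1.0 - abs(i - center)) for i in range(WIDTH)]

def predict(weights, bias, frame):
    """Map raw pixels straight to a steering value (linear stand-in for a CNN)."""
    return sum(w * p for w, p in zip(weights, frame)) + bias

def train(samples, lr=0.1, epochs=300):
    """Stochastic gradient descent on squared steering error."""
    weights, bias = [0.0] * WIDTH, 0.0
    for _ in range(epochs):
        for frame, steering in samples:
            error = predict(weights, bias, frame) - steering
            for i, pixel in enumerate(frame):
                weights[i] -= lr * error * pixel
            bias -= lr * error
    return weights, bias

# "Recorded driving": the only supervision is the steering the driver applied.
# The rule steering = -0.5 * offset is an invented stand-in for a human driver.
random.seed(0)
offsets = [random.uniform(-2.0, 2.0) for _ in range(200)]
samples = [(make_frame(o), -0.5 * o) for o in offsets]
weights, bias = train(samples)

print(predict(weights, bias, make_frame(1.0)))  # steering for a car drifted to the right
```

The model is never told where the lane marker is. It only sees pixels and steering values, which is the sense in which the training signal is sparse.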

Music Composition

Music can be composed based on different composer styles. http://web.mit.edu/felixsun/www/neural-music.html

Painting based on Artists Styles

Painting can be created based on famous artist painting styles. https://nucl.ai/blog/neural-doodles/

Discovery of New Materials

New materials are discovered with the help of deep learning. http://www.nature.com/articles/srep02810

Playing Video Games

Google DeepMind is able to create video game playing systems that learn how to play well from just the screen pixels and the game score. http://www.wired.co.uk/article/google-deepmind-atari

Playing Championship Level Go

Google DeepMind has created a Go playing system that is able to learn new strategies by playing against itself. http://www.scientificamerican.com/article/how-the-computer-beat-the-go-master/

Face Identification

Face recognition is so common that it is no longer surprising.

https://cmusatyalab.github.io/openface/

http://gitxiv.com/posts/fDJ7nHHou57aLEjBQ/the-megaface-benchmark-1-million-faces-for-recognition-at

Click-bait Headline Generation

An RNN is trained to generate click-bait headlines.

https://larseidnes.com/2015/10/13/auto-generating-clickbait-with-recurrent-neural-networks/

Colorization of Black and White Photographs

A system is trained to convert black and white photographs into color. http://richzhang.github.io/colorization/

http://demos.algorithmia.com/colorize-photos/ is a service that lets you try this out on your own photos!

Realtime Translation of Images of Text

Google has a mobile app that translates the text found in a photo into a language that you can understand.

https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html

Predictive Keyboards

Swiftkey is building keyboards for mobile phones that make it easier and faster for you to type. http://www.slashgear.com/swiftkey-neural-alpha-predicts-what-youll-type-08408912/

Predict the Future ;-)

Well, that's the claim by these folks at MIT: http://news.mit.edu/2016/teaching-machines-to-predict-the-future-0621

3D Object Classification

http://3dshapenets.cs.princeton.edu/

Gesture Recognition

Learning the meaning of different hand gestures is likely going to be how we interact with devices that don't have screens.

https://engineering.purdue.edu/cdesign/wp/deephand-robust-hand-pose-estimation/ https://atap.google.com/soli/

Deep Learning for Electromyographic Hand Gesture Signal Classification by Leveraging Transfer Learning

https://arxiv.org/abs/1801.07756

Converting Photos of People to make them Smile

SmileVector is able to take an image of a person and transform it into an image of the person smiling.

https://www.engadget.com/2016/06/27/twitter-bot-plasters-creepy-smiles-on-celebrities-faces/

Human Like Conversation

Google has created a messaging application that has more natural conversational capabilities. https://research.googleblog.com/2016/05/chat-smarter-with-allo.html

Augmented Reality - Face Tracking

Baidu created a mobile app that is able to track faces using Deep Learning. The app overlays a 3D image over one's face.

http://research.baidu.com/happy-halloween-baidu-research-introduces-faceyou/ https://www.technologyreview.com/s/602091/baidu-is-bringing-intelligent-ar-to-the-masses/

Warehouse Optimization

A Deep Learning system is trained to learn an optimal way of picking and placing items in a warehouse. This system is faster than the more traditional operations research optimization approach.

https://devblogs.nvidia.com/parallelforall/optimizing-warehouse-operations-machine-learning-gpus/

Sketch to Search

Sketch an image and use it as the query for a visual search.

https://news.developer.nvidia.com/using-sketches-to-search-for-products-online

Prostheses Control

http://arxiv.org/pdf/1602.05702v3.pdf

EEG-informed attended speaker extraction from recorded speech mixtures with application in neuro-steered hearing prostheses.

Accelerating Fluid Simulation

http://cims.nyu.edu/~schlacht/CNNFluids.htm

Leveraging convolution networks to create fast and highly realistic fluid simulations.

Personalization

Amazon drives its personalization capabilities using Deep Learning. http://blogs.aws.amazon.com/bigdata/post/TxGEL8IJ0CAXTK/Generating-Recommendations-at-Amazon-Scale-with-Apache-Spark-and-Amazon-DSSTNE

Brain Tumor Detection

Results reported on the 2013 BRATS test dataset reveal that the 802,368 parameter network improves over published state-of-the-art and is over 30 times faster.

https://arxiv.org/abs/1505.03540

Reducing your Electric Bill

Google is using technology from the DeepMind artificial intelligence subsidiary for big savings on the power consumed by its data centers.

http://www.bloomberg.com/news/articles/2016-07-19/google-cuts-its-giant-electricity-bill-with-deepmind-powered-ai

Stocking Shelves

Amazon-sponsored researchers used deep learning to analyze 3D scans of objects that their robot had to pick and replace.

http://www.theverge.com/2016/7/5/12095788/amazon-picking-robot-challenge-2016

Mapping Streets

Facebook is using Deep Learning to create more accurate and current maps from satellite imagery.

http://forum.openstreetmap.org/viewtopic.php?id=55220

Voice Printing

Identifying people through their voice.

https://www.technologyreview.com/s/537101/deep-learning-machine-solves-the-cocktail-party-problem/

Infrared Colorization

Users may more quickly and accurately comprehend infrared images that have been colorized.

http://arxiv.org/abs/1604.02245v3

3D Design

The system takes a 3D voxel representation of a shape and a semantic deformation intention (e.g., make more sporty) as input and generates a deformation flow as output.

http://www.creativeai.net/posts/CjrYHppotnFXbeWW8/learning-semantic-deformation-flows-with-3d-convolutional

Sketch to Generate Realistic Photos

Synthesize photorealistic face images from face sketches.

https://arxiv.org/pdf/1606.03073v1.pdf

Predicting Clinical Events

An RNN is trained on time-stamped EHR data from 260 thousand patients and 14,805 physicians over 8 years. The network is able to make multilabel predictions (one label for each diagnosis or medication category). The system can perform differential diagnosis with up to 79% recall, significantly higher than several baselines.
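For a multilabel predictor, recall measures what fraction of the labels that truly apply were actually predicted. The sketch below assumes recall is computed over the k highest-scoring labels, which the summary above does not specify. The function name `recall_at_k`, the label names, and the scores are all hypothetical.

```python
def recall_at_k(scores, true_labels, k):
    """Fraction of the true labels found among the k highest-scoring labels.

    scores:      dict mapping each candidate label to the model's score
    true_labels: set of labels that actually apply to this patient visit
    """
    if not true_labels:
        raise ValueError("recall is undefined when there are no true labels")
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for label in top_k if label in true_labels)
    return hits / len(true_labels)

# Hypothetical scores for one visit; the model emits one score per category.
scores = {"hypertension": 0.8, "diabetes": 0.6, "asthma": 0.3, "anemia": 0.1}
actual = {"hypertension", "asthma"}

print(recall_at_k(scores, actual, k=2))  # only one of the two true labels is in the top 2
print(recall_at_k(scores, actual, k=3))  # both true labels are in the top 3
```

Reporting recall over a short ranked list fits the differential diagnosis setting, where a clinician reviews a handful of candidate diagnoses and not just a single answer.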

http://arxiv.org/pdf/1511.05942v9.pdf

Skin Evaluation and Recommendation

http://www.glossy.co/making-it-personal/olay-built-a-skin-evaluation-tool-to-help-drugstore-shoppers

Using Deep Learning to determine a customer’s “skin age,” identify problem areas and offer a regimen of products meant to address those issues.

Bioinformatics

http://www.mdpi.com/1422-0067/17/8/1313/htm

Drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining.

Art

http://iq.intel.com/getting-creative-ai-and-machine-learning/

Reducing Risk in Agriculture due to Climate Change

http://www.slideshare.net/ErikAndrejko/deep-learninginagriculture

Mapping Poverty using Satellite Data

https://news.developer.nvidia.com/deep-learning-and-satellite-data-helping-map-poverty

Discover New Compression Algorithms

http://arxiv.org/abs/1608.05148 Full Resolution Image Compression with Recurrent Neural Networks

This is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.
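A rate-distortion curve plots how much quality is lost (distortion) against how many bits are spent (rate). The sketch below shows one way to compute a single point on such a curve. It uses PSNR as the distortion measure because it is simple and familiar, which is an assumption: the paper may report different perceptual metrics. The helper names and the sample numbers are hypothetical.

```python
import math

def bits_per_pixel(compressed_bytes, width, height):
    """Rate: size of the compressed file divided by the number of pixels."""
    return compressed_bytes * 8 / (width * height)

def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.
    Higher values mean the decoded image is closer to the original."""
    if len(original) != len(decoded):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images, no distortion
    return 10 * math.log10(peak ** 2 / mse)

# One hypothetical point on a rate-distortion curve for a 768x512 image.
rate = bits_per_pixel(compressed_bytes=24576, width=768, height=512)
quality = psnr([10, 20, 30, 40], [11, 19, 30, 40])
print(rate, quality)
```

Comparing two codecs then amounts to comparing quality at matched rates: a codec outperforms another at a given bitrate if it achieves lower distortion for the same bits per pixel.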

Writing a Pop Song

http://www.theverge.com/2016/9/26/13055938/ai-pop-song-daddys-car-sony

Transfiguring Portraits

Place your face into another portrait.

http://homes.cs.washington.edu/~kemelmi/Transfiguring_Portraits_Kemelmacher_SIGGRAPH2016.pdf

Speech Synthesis

https://deepmind.com/blog/wavenet-generative-model-raw-audio/

Blur Out Background in Photographs

http://www.theverge.com/2016/9/8/12839838/apple-iphone-7-plus-ai-machine-learning-bokeh-photography

Predicting Corporate Bankruptcies

http://onlinelibrary.wiley.com/doi/10.1111/jbfa.12218/full

YouTube Recommendations

http://research.google.com/pubs/pub45530.html

Sorting Cucumbers

http://www.freshplaza.com/article/162739/Farmer-develops-cucumber-sorting-machine-with-the-help-of-Google

Reducing Traffic

http://www.engineering.com/DesignerEdge/DesignerEdgeArticles/ArticleID/13111/Machine-Learning-Techniques-Aim-to-Reduce-Traffic.aspx

Reverse Engineering Biological Processes

http://phys.org/news/2015-06-planarian-regeneration-artificial-intelligence.html

Realtime Facial Transfer

Research at Stanford shows how you can transfer your expressions onto someone else's face in realtime: http://graphics.stanford.edu/~niessner/thies2015realtime.html This is not a deep learning application, but I would not be surprised if a deep learning system could do something similar. A deep learning approach that is not realtime: https://arxiv.org/pdf/1610.05586v1.pdf

Fast Face-swap Using Convolutional Neural Networks

https://arxiv.org/abs/1611.09577

Swap the face of Nicolas Cage or Taylor Swift onto another person's face.

Virtual Assistant

https://x.ai/a-peek-at-x-ais-data-science-architecture

Analysis of Disaster Damage

http://www.purdue.edu/newsroom/releases/2016/Q4/automated-method-allows-rapid-analysis-of-disaster-damage-to-structures.html

Realtime Conversational Assistance

http://www.huffingtonpost.com/adi-gaskell/machine-learning-and-the-_b_12652122.html

Detect Fashionable Clothing

http://qz.com/821512/artificial-intelligence-for-fashion/

Baby Sleep Monitor

https://blogs.nvidia.com/blog/2016/10/30/babbycam-baby-monitor-deep-learning/

Voice Conversion

https://arxiv.org/abs/1610.08927v1 Voice Conversion using Convolutional Neural Networks

Music Genre Classification

https://18798-presscdn-pagely.netdna-ssl.com/ismir2016/wp-content/uploads/sites/2294/2016/07/159_Paper.pdf

Photorealistic Facial Texture Inference

http://www.creativeai.net/posts/Hvjbie4sbDJkLjw4S/photorealistic-facial-texture-inference-using-deep-neural

Music Classification

https://arxiv.org/abs/1611.09827v1

DeepHealth

http://www.nature.com/articles/srep26094

Image Editing

https://www.youtube.com/watch?v=KXmZ39brkzE

https://arxiv.org/pdf/1702.06683.pdf Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US

Story Points (Task Estimation)

https://arxiv.org/abs/1609.00489

Other Vision Applications https://github.com/kjw0612/awesome-deep-vision

Scene Text Erase

https://arxiv.org/abs/1705.02772v1

Visual Product Discovery

http://www.sentient.ai/aware/

https://arxiv.org/abs/1702.04680

Spatial-Temporal Recurrent Neural Network for Emotion Recognition

https://arxiv.org/abs/1705.04515

Facial Animation

https://arstechnica.com/gaming/2017/08/nvidia-remedy-neural-network-facial-animation/

Crowdturfing

https://arxiv.org/pdf/1708.08151.pdf Automated Crowdturfing Attacks and Defenses in Online Review Systems

Watermark Removal

https://watermark-cvpr17.github.io/

The Conditional Analogy GAN: Swapping Fashion Articles on People Images

https://arxiv.org/pdf/1709.04695v1.pdf

Inspection

https://www.inc.com/kate-l-harrison/yale-student-invents-drone-to-solve-25-trillion-corrosion-problem.html?cid=+sf01003&sr_share=linkedin

See Behind Walls

Neural network identification of people hidden from view with a single-pixel, single-photon detector

https://arxiv.org/abs/1709.07244

Chemical Synthesis

Learning to Plan Chemical Syntheses https://arxiv.org/pdf/1708.04202.pdf

Smart Mirror Makeup

https://arxiv.org/pdf/1709.07566.pdf

https://www.linkedin.com/pulse/your-expertise-longer-needed-sincerely-deep-ben-taylor-ai-hacker

Reading Text in the Wild http://www.robots.ox.ac.uk/~vgg/research/text/#sec-models

http://linkis.com/www.nextplatform.com/inFee

https://estranhosidade.wordpress.com/2016/02/20/the-automation-of-the-technical-part-of-art-the-use-of-artificial-intelligence-in-the-artistic-creation/ THE AUTOMATION OF THE “TECHNICAL” PART OF ART: THE USE OF ARTIFICIAL INTELLIGENCE IN THE ARTISTIC CREATION

https://news.ycombinator.com/item?id=13159908

http://www.yaronhadad.com/deep-learning-most-amazing-applications

http://www.cim.mcgill.ca/~mrl/pubs/saul/egsr04.pdf Sketch Interpretation and Refinement Using Statistical Models

https://arxiv.org/abs/1709.05424v1 NIMA: Neural Image Assessment

https://arxiv.org/abs/1802.02511v1 DeepHeart: Semi-Supervised Sequence Learning for Cardiovascular Risk Prediction

https://www.biorxiv.org/content/early/2018/02/14/265231.full.pdf+html END-TO-END DIFFERENTIABLE LEARNING OF PROTEIN STRUCTURE

https://arxiv.org/abs/1802.06006 Voice cloning

http://www.cs.columbia.edu/cg/fontcode/

https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/

https://blogs.technet.microsoft.com/machinelearning/2018/10/03/snip-insights-an-open-source-cross-platform-ai-tool-for-intelligent-screen-capture/

Watermark removal

https://arxiv.org/pdf/1803.04189.pdf

https://arxiv.org/abs/1811.08009v1 Logo Detection

Caricature Drawing https://ai.stanford.edu/~kaidicao/carigan.pdf

Lung cancer

https://github.com/ncoudray/DeepPATH

Assistive Creativity

Generative Creativity FontCode: Embedding Information in Text Documents using Glyph Perturbation