A brief overview of Deep Learning


I’m putting together an outline for a presentation I’m giving to NVIDIA soon, as part of the certification process for their Deep Learning Institute. I plan on breaking this into a 5-minute talk as well as a 30-minute session, to give developers a high-level overview of Deep Learning.



  • What is deep learning?

    • First conceived in the 1950s, deep learning is a subset of machine learning algorithms that learn by using a large, many-layered collection of connected processing units and exposing those units to a vast set of examples. One of its primary attributes is the ability to identify patterns in unstructured data.


  • Why a sudden resurgence?

    • Advanced algorithms have become practical thanks to rapid improvements in fast storage capacity, computing power, and parallelization.


  • Use cases & business applications
    • Use Cases: Computer vision, voice recognition, and natural language processing (NLP).


    • Business Applications: Text-based searches, fraud detection, handwriting recognition, image search, and translation.


    • Problems lending themselves to DL include medical diagnosis, demand prediction, malware detection, self-driving cars, customer churn, and failure prediction.


  • Shortcomings

    • Can be expensive and tricky to set up, and training neural networks requires a large amount of data.


    • Still a very immature market, and most organizations lack the necessary data science skills for even simple machine learning solutions, let alone for deep learning.


    • It’s not clear up front whether deep learning will solve a given problem at all – there is simply no mathematical theory available that indicates whether a “good enough” deep learning solution even exists.


  • DL vs ML

    • A deep learning model can learn useful features from raw data on its own, while a standard machine learning model typically has to be told which features to use (through hand-engineered inputs) before it can make accurate predictions.


    • Conceptually, deep learning is a kind of machine learning, but it differs in that it can work directly on raw digital representations of data.


    • Deep learning has the potential not only to limit the human biases that go into choosing inputs, but also to find measures that are more meaningful than the inputs machine learning relies on today.


  • Algorithms

    • Deep neural networks (DNNs): The dominant deep learning algorithms, which are neural networks constructed from many layers (“deep”) of alternating linear and nonlinear processing units.


    • Random Decision Forests (RDFs): Also built from many components, but instead of layers of neurons the RDF is an ensemble of decision trees, and it outputs a statistical average (or majority vote) of the individual trees.
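
To make "many layers of alternating linear and nonlinear processing units" concrete, here is a minimal sketch of a forward pass through a small DNN in plain Python. The layer sizes, the random (untrained) weights, the choice of ReLU as the nonlinearity, and the helper names (`linear`, `relu`, `make_layer`, `forward`) are all assumptions made for illustration; a real project would use one of the frameworks listed later and would train the weights on data.

```python
import random


def linear(x, weights, bias):
    """Linear unit: one weighted sum per output neuron (y = W x + b)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Nonlinear unit: keep positive values, zero out the rest."""
    return [max(0.0, v) for v in x]


def make_layer(n_in, n_out, rng):
    """Random, untrained weights and zero biases (hypothetical helper)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def forward(x, layers):
    """Alternate linear and nonlinear steps; the last layer stays linear."""
    for i, (weights, bias) in enumerate(layers):
        x = linear(x, weights, bias)
        if i < len(layers) - 1:
            x = relu(x)
    return x


rng = random.Random(0)
sizes = [4, 8, 8, 3]  # 4 inputs -> two hidden layers of 8 -> 3 outputs
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(sizes, sizes[1:])]
output = forward([0.5, -1.2, 3.0, 0.1], layers)
print(output)  # three numbers, one per output neuron
```

Adding more entries to `sizes` makes the network deeper; training would then adjust the weights so the outputs match labeled examples.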

Humble Beginnings


  • Neural Nets

    • First conceived in the 1950s, although many of the key algorithmic advances occurred in the 1980s and 1990s.


  • Boltzmann Machine

    • Geoffrey Hinton and Terry Sejnowski developed the Boltzmann machine in the early 1980s: a network of symmetrically connected, neuron-like units that make stochastic decisions about whether to be on or off.


  • The term “deep learning”

    • Only started gaining acceptance after the publication of a paper by University of Toronto Professor Geoffrey Hinton and his graduate student Ruslan Salakhutdinov. In 2006 they showed that neural nets could be effectively pre-trained one layer at a time to accelerate subsequent supervised learning, which then fine-tunes the result.
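
Going back to the Boltzmann machine above, the "stochastic decision about whether to be on or off" can be sketched in a few lines of Python. This assumes the standard logistic update rule, where a unit turns on with probability 1 / (1 + exp(-gap / T)) and the gap is the unit's bias plus the weighted sum of the other units' states. The three-unit network, its weights and biases, and the function names are made up for illustration.

```python
import math
import random


def on_probability(i, states, weights, biases, temperature=1.0):
    """Probability that unit i switches on, given the other units' states."""
    energy_gap = biases[i] + sum(
        weights[i][j] * states[j] for j in range(len(states)) if j != i
    )
    return 1.0 / (1.0 + math.exp(-energy_gap / temperature))


def update_unit(i, states, weights, biases, rng, temperature=1.0):
    """Stochastically set unit i to on (1) or off (0)."""
    p = on_probability(i, states, weights, biases, temperature)
    states[i] = 1 if rng.random() < p else 0
    return states[i]


# Illustrative symmetric weights (weights[i][j] == weights[j][i]),
# with no self-connections on the diagonal.
weights = [
    [0.0, 2.0, -1.0],
    [2.0, 0.0, 0.5],
    [-1.0, 0.5, 0.0],
]
biases = [0.0, -0.5, 0.2]
states = [1, 0, 1]

rng = random.Random(42)
for sweep in range(5):
    for i in range(len(states)):
        update_unit(i, states, weights, biases, rng)
print(states)
```

Because each update is random, repeated sweeps wander through different on/off patterns; this sketch only samples states and does not include any learning of the weights.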

Current State of the Market


  • Open Source frameworks

    • Caffe, Deeplearning4j, MXNet, Google’s TensorFlow, and Theano


  • Hardware

    • Google revealed in May 2016 that it had been secretly using its own tailor-made chips called Tensor Processing Units (TPUs) to implement applications trained by deep learning
      • A tensor is a multi-dimensional array


    • NVIDIA is a pioneer in the space: its Kepler GPUs power Microsoft’s and Amazon’s clouds, and it also builds the Jetson TK-x and DGX-1 hardware.


    • In July 2017, Harry Shum, a Microsoft EVP, showed off a new chip created for HoloLens that includes a module custom-designed to efficiently run deep learning software.


    • FPGAs (field-programmable gate arrays) are also expected to see considerable growth because they can deliver higher performance per watt than GPUs.


  • Big Demand

    • Given their size and the number of advertisers they have, Google and Facebook can afford to hire the most accomplished deep learning talent and pay them handsomely. According to Microsoft CVP Peter Lee, there’s a “bloody war for talent in this space.”


    • Deep learning represented almost half of all enterprise AI revenue in 2016, according to Tractica.





