DB 401 - Hands on Deep Learning with Keras, TensorFlow, and Apache Spark™



This course offers a thorough, hands-on overview of deep learning and how to scale it with Apache Spark.


This course covers the fundamentals of neural networks and how to build distributed deep learning models on top of Spark. Throughout the class, you will use Keras, TensorFlow, MLflow, and Horovod to build, tune, and apply models.


2 Days


Upon completion, students should be able to:

  • Build Deep Learning models using Keras/TensorFlow
  • Scale & train distributed models using Horovod
  • Apply models at scale using Vectorized User Defined Functions
  • Track experiments using MLflow
  • Apply model interpretability libraries to understand & visualize model predictions
  • Perform Transfer Learning to speed up model training time & convergence
  • Tune hyperparameters at scale with Hyperopt
  • Perform image classification and object detection tasks with Deep Learning
  • Implement Generative Adversarial Networks


  • This course is ideal for data scientists who are interested in applying Deep Learning at scale
  • This course is suitable for data analysts and machine learning engineers


Prerequisite Knowledge:

  • Familiarity with Python is required
  • Experience with Pandas and NumPy is required
  • Familiarity with Spark is helpful
  • Familiarity with Machine Learning concepts is suggested

Prerequisite Courses:


Software & Hardware Requirements

  • Web Browser: Chrome
  • An Internet Connection
  • GoToTraining (for remote classes only)
    Please see the GoToMeeting System Check
  • A computer, laptop, or tablet with a keyboard

Additional Notes

  • The appropriate, web-based programming environment will be provided to students
  • This class is taught in Python only
  • For the public classes, this course is often scheduled over two half-days


  • Spark Review
  • Build a linear regression model using scikit-learn and reimplement it in Keras, modify the number of epochs, and visualize the loss
  • Modify these parameters for increased model performance: activation functions, loss functions, optimizer, batch size
  • Perform data standardization for better model convergence, create custom metrics, add validation data, generate model checkpointing/callbacks, use TensorBoard, and save and load models
  • Log experiments with MLflow, view MLflow UI, and generate a UDF with MLflow and apply to a Spark DataFrame
  • Use Horovod to train a distributed neural network using Parquet files and Petastorm
  • Combine user and item factors identified from ALS and use them as input to a neural network, create a custom activation function (scaled sigmoid) to bound the output of regression tasks, and train a distributed neural network using Horovod
  • Use LIME and SHAP to understand which features are most important in the model's prediction for a given data point
  • Analyze popular CNN architectures and apply pre-trained CNNs to images using Pandas Scalar Iterator UDF
  • Use LIME to understand how the CNN model makes a prediction
  • Perform transfer learning to create a cat vs dog classifier
  • Use Hyperopt to train and optimize a feed-forward neural net
  • Get familiar with a few popular papers on Deep Learning in practice
  • Detect objects in an image using Faster R-CNN
  • Learn about generative and discriminative models and get hands-on experience creating GANs
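
As a rough illustration of the scaled sigmoid mentioned in the outline, the sketch below bounds a regression output to a fixed range. It is not taken from the course materials: the function name `scaled_sigmoid`, the parameters `low` and `high`, and the 0.5 to 5.0 rating range are assumptions chosen for the example. It uses plain Python rather than Keras so that it runs without extra dependencies.

```python
import math


def scaled_sigmoid(x, low=0.5, high=5.0):
    """Squash a real-valued input into the range [low, high].

    `low` and `high` are hypothetical bounds (for example, a rating scale);
    they are assumptions for this sketch, not values from the course.
    """
    # Numerically stable logistic function: avoid exp() of a large positive number.
    if x >= 0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        s = e / (1.0 + e)
    # Stretch the (0, 1) sigmoid output to the target range.
    return low + (high - low) * s


# Raw network outputs (logits) of any magnitude map into the bounded range.
predictions = [scaled_sigmoid(z) for z in (-10.0, 0.0, 10.0)]
print(predictions)
```

An input of zero lands at the midpoint of the range (2.75 with these bounds), and very large or very small inputs approach the upper and lower bounds. In a neural network, the same formula would be applied to the final layer's output in place of an unbounded linear activation.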

Upcoming Classes

  • Jul 30 - 31, 9:00 AM - 5:00 PM Eastern Daylight Time, Online - Virtual - US Eastern, $1500.00 USD
  • Sep 17 - 18, 9:00 AM - 5:00 PM Pacific Daylight Time, Online - Virtual - US Pacific, $1500.00 USD
  • Oct 29 - 30, 9:00 AM - 5:00 PM Eastern Daylight Time, Online - Virtual - US Eastern, $1500.00 USD
  • Dec 17 - 18, 9:00 AM - 5:00 PM Eastern Standard Time, Online - Virtual - US Eastern, $1500.00 USD