DB 401 - Hands on Deep Learning with Keras, TensorFlow, and Apache Spark™

Summary

This course offers a thorough, hands-on overview of deep learning and how to scale it with Apache Spark.

Description

This course covers the fundamentals of neural networks and how to build distributed deep learning models on top of Spark. Throughout the class, you will use Keras, TensorFlow, MLflow, and Horovod to build, tune, and apply models. This course is taught entirely in Python.

Duration

2 Days

Objectives

Upon completion, students will be able to:

  • Build Deep Learning models using Keras/TensorFlow
  • Scale & train distributed models using Horovod
  • Apply models at scale using Vectorized User Defined Functions
  • Perform Transfer Learning to speed up model training time & convergence
  • Apply model interpretability libraries to understand & visualize model predictions

Audience

This course is aimed at the practicing data scientist who is eager to get started with deep learning, as well as software engineers and technical managers interested in a thorough, hands-on overview of deep learning and its integration with Apache Spark.

Prerequisites

Prerequisite Knowledge

  • Basic programming constructs of Python & PySpark
  • Working knowledge of ML concepts (e.g. regression, classification, evaluation metrics, etc.)

Prerequisite Courses

  • DB 100, 105, or 301

Additional Notes

Software & Hardware Requirements

  • A computer or laptop with an internet connection
  • Web Browser: Chrome
  • For virtual classes/students, the ability to run GoToTraining.

Outline

Day 1 - AM

Time | Lesson | Description
30m | Introductions & Setup | Registration, Courseware & Q&As
30m | Spark Review | Create a Spark DataFrame; Analyze the Spark UI; Cache data; Change Spark default configurations to speed up queries
10m | Break
35m | Linear Regression | Build a linear regression model using scikit-learn and reimplement it in Keras; Modify the number of epochs; Visualize loss
30m | Keras | Modify these parameters for increased model performance: Activation functions, Loss functions, Optimizer, Batch Size
10m | Break
30m | Keras Lab | Build and evaluate your first Keras model! (Students use Boston Housing Dataset, Instructor uses California Housing)

Day 1 - PM

Time | Lesson | Description
35m | Advanced Keras | Perform data standardization for better model convergence; Create custom metrics; Add validation data; Generate model checkpointing/callbacks; Save and load models
30m | Advanced Keras Lab | Perform data standardization; Generate a separate train/validation dataset; Create earlycheckpointing callback; Load and apply your saved model
10m | Break
30m | MLflow | Log experiments with MLflow; View MLflow UI; Generate a UDF with MLflow and apply to a Spark DataFrame
20m | MLflow Lab | Add MLflow to your experiments from the Boston Housing Dataset! Bonus: Create LambdaCallback to log MLflow metrics while the model is training (after each epoch); Create a UDF that you can invoke in SQL; Get the lowest MSE!
10m | Break
40m | Horovod | Use Horovod to train a distributed neural network; Distributed Deep Learning best practices
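
As a rough illustration of the data standardization step named in the Advanced Keras lessons above, the sketch below uses only the Python standard library. It is not taken from the courseware: the function names and the sample values are hypothetical, and the course itself may well use a library scaler instead. The point it shows is that the mean and standard deviation are computed from the training data only and then reused for the validation data.

```python
from statistics import mean, pstdev


def fit_standardizer(train_column):
    """Return (mean, std) computed from the training values only."""
    mu = mean(train_column)
    sigma = pstdev(train_column)
    if sigma == 0:
        # A constant feature has no spread; avoid dividing by zero.
        sigma = 1.0
    return mu, sigma


def standardize(values, mu, sigma):
    """Shift and scale values using previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values, for illustration only.
train = [2.0, 4.0, 6.0, 8.0]
validation = [5.0, 10.0]

mu, sigma = fit_standardizer(train)
train_scaled = standardize(train, mu, sigma)
# The validation split reuses the training statistics rather than its own.
validation_scaled = standardize(validation, mu, sigma)
```

After this step the training column has zero mean and unit standard deviation, while the validation column is only approximately centered, which is expected.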

Day 2 - AM

Time | Lesson | Description
20m | Review | Review of Day 1
30m | Horovod Petastorm | Use Horovod to train a distributed neural network using Parquet files + Petastorm
10m | Break
40m | Horovod ALS | Combine User + Item factors identified from ALS and use as input to a neural network; Create custom activation function (scaled sigmoid) to bound output of regression tasks; Train distributed neural network using Horovod
45m | Horovod Lab | Prepare your data for use with Horovod; Distribute the training of our model using HorovodRunner; Use Parquet files as input data for our distributed deep learning model with Petastorm + Horovod
10m | Break
35m | Model Interpretability | Use LIME and SHAP to understand which features are most important in the model's prediction for a given data point
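
The Horovod ALS lesson above mentions a scaled sigmoid activation that bounds the output of a regression task. A minimal sketch of the idea in plain Python follows. The function name, the bounds (1 and 5), and the scalar form are assumptions made for illustration; in the course the activation would presumably be written with tensor operations so that Keras can use it as a layer activation.

```python
import math


def scaled_sigmoid(x, low, high):
    """Map any real x into the open interval (low, high).

    Computes low + (high - low) * sigmoid(x), using a numerically
    stable form of the sigmoid so large negative x does not overflow.
    """
    if x >= 0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        s = e / (1.0 + e)
    return low + (high - low) * s


# Hypothetical bounds: predictions constrained to a 1-to-5 scale.
midpoint = scaled_sigmoid(0.0, 1.0, 5.0)       # sigmoid(0) = 0.5, so 3.0
near_low = scaled_sigmoid(-1000.0, 1.0, 5.0)   # saturates at the lower bound
near_high = scaled_sigmoid(1000.0, 1.0, 5.0)   # saturates at the upper bound
```

Bounding the output this way keeps predictions inside the valid range of the target, so the network does not need to learn that range on its own.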

Day 2 - PM

Time | Lesson | Description
45m | CNNs | Analyze popular CNN architectures; Apply pre-trained CNNs to images using Pandas Scalar Iterator UDF
20m | LIME for CNNs | Use LIME to visualize how the CNN makes predictions
10m | Break
30m | Transfer Learning | Perform transfer learning to create a cat vs dog classifier
20m | Transfer Learning Lab | Build a model with nearly perfect accuracy predicting if a patient has pneumonia or not using transfer learning
10m | Break
20m | HyperOpt | Use HyperOpt to train and optimize a feed-forward neural net
25m | Best Practices | Discuss DL best practices, state of the art, and new research areas

Upcoming Classes

Date | Time | Location | Price
Nov 25 - Nov 26 | 9:00 AM - 5:00 PM Greenwich Mean Time | Online - Virtual Class - GMT Time | $ 2000.00 USD
Dec 16 - Dec 17 | 9:00 AM - 5:00 PM Pacific Standard Time | Online - Virtual Class - US Pacific Time | $ 2000.00 USD
Jan 23 | 9:00 AM - 5:00 PM Eastern Standard Time | McLean, United States | $ 2000.00 USD
Jan 23 | 9:00 AM - 5:00 PM Eastern Standard Time | Online - Virtual Class - US Eastern Time | $ 2000.00 USD
Mar 12 | 9:00 AM - 5:00 PM Pacific Daylight Time | San Francisco, United States | $ 2000.00 USD
Mar 12 | 9:00 AM - 5:00 PM Pacific Daylight Time | Online - Virtual Class - US Pacific Time | $ 2000.00 USD
May 7 | 9:00 AM - 5:00 PM Eastern Daylight Time | McLean, United States | $ 2000.00 USD
May 7 | 9:00 AM - 5:00 PM Eastern Daylight Time | Online - Virtual Class - US Eastern Time | $ 2000.00 USD
Jun 25 | 9:00 AM - 5:00 PM Pacific Daylight Time | San Francisco, United States | $ 2000.00 USD
Jun 25 | 9:00 AM - 5:00 PM Pacific Daylight Time | Online - Virtual Class - US Pacific Time | $ 2000.00 USD
