SP840: Managed Delta Lake (AWS Databricks)
This hands-on self-paced training course targets Data Engineers, Data Scientists and Data Analysts who want to use Managed Delta Lake for ETL processing on data lakes. The course ends with a capstone project building a complete data pipeline using Managed Delta Lake.
NOTE: This course is specific to the Databricks Unified Analytics Platform (based on Apache Spark™). While you might find it helpful for learning how to use Apache Spark in other environments, it does not teach you how to use Apache Spark in those environments.
Duration: 3-6 hours, 75% hands-on
Format: Self-paced eLearning
The course consists of seven self-paced lessons followed by the capstone project. Each lesson includes hands-on exercises.
This version of the course is intended to be run on Databricks on AWS.
During this course, learners will:
- Use the interactive Databricks notebook environment.
- Create, append and upsert data into a data lake.
- Use Managed Delta Lake to manage and extract actionable insights out of a data lake.
- Use Databricks advanced optimization features to speed up queries.
- Seamlessly ingest streaming and historical data.
- Implement a data pipeline using Managed Delta Lake.
- Introducing Delta Lake
- Capstone Project
- Primary Audience: Data Engineers
SP800: Getting Started with Apache Spark™ DataFrames AWS
SP805: Getting Started with Apache Spark SQL AWS
SP820: ETL Part 1 – Data Extraction AWS
SP821: ETL Part 2 – Transformations and Loads AWS
- Please be sure to use a supported browser.
This self-paced training course may be used by 1 user for 12 months from the date of purchase. It may not be transferred or shared with any other user.