This course is directed at those who want to use Delta Lake for ETL processing on data lakes. It ends with a capstone project in which students build a complete data pipeline using Delta Lake.
After taking this class, students will be able to:
- Use the interactive Databricks notebook environment.
- Use Delta Lake to create, append and upsert data into a data lake.
- Use Delta Lake to manage and extract actionable insights out of a data lake.
- Use Databricks advanced optimization features to speed up queries.
- Seamlessly ingest streaming and historical data.
- Implement a data pipeline using Delta Lake.
This course is intended for data engineers, software engineers, DevOps and IT operations staff, and team leads with experience using Databricks.
Participants must have completed the following courses, or already have equivalent knowledge:
- SP800: Getting Started with Apache Spark™ DataFrames (Azure | AWS)
- SP805: Getting Started with Apache Spark SQL (Azure | AWS)
- SP820: ETL Part 1 – Data Extraction (Azure | AWS)
- SP821: ETL Part 2 – Transformations and Loads (Azure | AWS)
All participants will need:
- A laptop with an up-to-date version of Chrome or Firefox (Internet Explorer and Safari are not supported).
- An internet connection that can support the use of GoToTraining.
GoToTraining is the online platform through which the class will be delivered. Prior to attendance, each registrant will receive GoToTraining log-in instructions.
For more information and to confirm your computer can run GoToTraining, please check here: Validation