Advanced Data Engineering with Databricks

Summary

COMING SOON (Early 2021)! This 2-day course enables Databricks users to design performant architectures with Structured Streaming and Delta Lake, with the goal of realizing the Lakehouse. You will learn best practices for developing and deploying modular Python code for Databricks-optimized Spark workloads.

Description

In this course, participants will learn advanced topics in building and maintaining data engineering workloads on Databricks. The course builds on skills taught in the Data Engineering with Databricks course and highlights the features of the Databricks platform that make it well suited for production data engineering, with an emphasis on Spark 3, Delta Lake, Structured Streaming, and proprietary platform features.

Duration

2 Days

Objectives

Upon completion of this course, students should be able to:

  • Design and implement a multi-pipeline, multi-hop architecture to enable the Lakehouse paradigm.
  • Deploy Structured Streaming operations that take advantage of Databricks Job scheduling capabilities and avoid workspace limitations.
  • Implement Databricks-native code that leverages platform-specific Delta Lake features to simplify production workloads.
  • Master design patterns that enable common use cases, including change data capture (CDC), slowly changing dimensions (SCD), and managing personally identifiable information (PII).
  • Refactor notebook-based code into a testable framework implementing best practices for code development.

Audience

Data Engineers

Prerequisites

  • Advanced experience using Apache Spark
  • Advanced experience coding with Python
  • Intermediate experience writing SQL queries
  • Intermediate experience using Databricks platform
  • Intermediate experience using Delta Lake
  • Intermediate experience using Structured Streaming
  • Intermediate knowledge of data engineering concepts

Additional Notes

All participants will need:

  • an internet connection
  • a device that is compliant with the following supported internet browsers
  • to confirm that their device can run GoToTraining: Validate

NOTE: The class will be delivered through GoToTraining, our chosen online platform. Prior to attendance, each registrant will receive GoToTraining log-in instructions.

Upcoming Classes

No classes have been scheduled.