Delta Lake Rapid Start with Spark SQL

The eLearning is free.

Summary

Delta Lake is a robust storage solution designed specifically to work with Apache Spark™. In this course, you will learn and use the primary methods for working with Delta Lake using Spark SQL.

Description

Upon completion of this course, you will be able to:

  • Use Delta Lake to create a new Delta table and to convert an existing Parquet-based data lake table
  • Differentiate between a batch append and an upsert to a Delta table
  • Use Delta Lake Time Travel to view different versions of a Delta table
  • Execute a MERGE command to upsert data into a Delta table
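
As a rough illustration of the kinds of statements these objectives refer to, here is a minimal Spark SQL sketch. It is not taken from the course materials: the table names (events_raw, events_delta, new_events, updates), the event_id key column, and the Parquet path are all hypothetical, and the sketch assumes the source and target tables share the same schema.

```sql
-- Create a new Delta table from an existing table using the CTAS pattern.
-- events_raw and events_delta are hypothetical names.
CREATE TABLE events_delta
USING DELTA
AS SELECT * FROM events_raw;

-- Convert an existing Parquet directory to Delta in place.
-- The path is a placeholder; a partitioned table would also need
-- a PARTITIONED BY clause listing its partition columns.
CONVERT TO DELTA parquet.`/path/to/parquet_table`;

-- Batch append: every incoming row is added, even if a row with
-- the same key already exists in the target.
INSERT INTO events_delta
SELECT * FROM new_events;

-- Upsert: rows whose key matches are updated, the rest are inserted.
-- event_id is an assumed key column.
MERGE INTO events_delta AS t
USING updates AS s
ON t.event_id = s.event_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Time Travel: list the table's versions, then query an earlier one.
DESCRIBE HISTORY events_delta;
SELECT * FROM events_delta VERSION AS OF 0;
```

The append and the MERGE differ only in how they treat rows whose key already exists in the target: the append duplicates them, while the upsert overwrites them. Each statement that writes to the table produces a new version that DESCRIBE HISTORY lists and that Time Travel can query.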

Who should take this course?

  • Data Analysts
  • Data Scientists
  • Data Engineers interested in data pipelines built using Spark SQL alone

Prerequisite knowledge required

  • How to upload data into a Databricks workspace
  • How to visualize data using Databricks
  • Intermediate-level Spark SQL usage, including the CTAS (CREATE TABLE AS SELECT) pattern, Spark SQL functions such as from_unixtime, lag, and lead, and partitioning
  • The fundamentals of Delta Lake