Fundamentals of SQL on Databricks

Summary

Learn the benefits of using Spark SQL on Databricks and get hands-on practice with it.

Description

Databricks, a managed platform for running Apache Spark, provides a premier environment for processing SQL workloads. Spark SQL is a Spark module for structured data processing that can also act as a distributed SQL query engine, enabling unmodified Hadoop Hive queries to run up to 100x faster on existing deployments and data. Users with a traditional SQL background can begin working in the Databricks SQL environment immediately. Using Spark SQL on Databricks has multiple advantages over using SQL with traditional tools.

Learning objectives

  • Compare Spark SQL on Databricks to other SQL tools.

  • Describe basic Spark architecture.

  • Copy and run queries in a Databricks notebook.

  • Explain how common functions and Databricks tools can be applied to upload, view, and visualize data.

  • Identify steps to access the Spark UI from an error message.

Prerequisites

  • Beginning experience working in Databricks

  • Beginning experience working with business intelligence tools is helpful

Learning path

  • This course is part of the SQL Analyst learning path.

Proof of completion

  • After completing 80% of this course, you will receive a proof of completion.