DB 100 - Apache Spark™ Overview

Summary

This 1-day course provides a brief introduction to the Apache Spark architecture and the DataFrames API, plus one elective chosen from several options. Together these cover the fundamentals of the Apache Spark framework.

Description

This course focuses on the fundamentals of the Apache Spark ecosystem. It aims to provide the foundational knowledge required for Data Analysts, Data Engineers, Data Scientists, ML Practitioners, or anyone interested in starting to develop with the Apache Spark framework.

The course starts with an introduction to the Spark architecture, with an emphasis on high-level concepts such as Drivers, Executors, and Slots, as well as Applications, Jobs, Stages, and Tasks. As time permits, intermediate topics such as DAG Execution may also be covered.

The course continues with a crash course in the DataFrame APIs, covering the "core" components such as the SparkSession, Readers & Writers, DataFrames, and the Spark SQL functions.

Additional 1-hour electives are available, covering more of the DataFrames API, Structured Streaming, or a demo of the Spark-ML packages for Machine Learning.

Duration

7 hours

Objectives

Upon completion, students should be able to:

  • Describe how Apache Spark's distributed design allows for the processing of gigabytes to terabytes of data
  • Apply basic intuition to the minor, albeit common, performance problems that new developers often encounter
  • Use the DataFrame APIs to ingest, alter and write data
  • Understand the breadth and depth of Apache Spark's capabilities

Optionally:

  • Create Structured Streaming jobs
  • Understand how the machine learning pipeline works

Audience

Anyone with a software development background who wants a quick introduction to the core Spark APIs and a basic introduction to the Apache Spark architecture.

Prerequisites

Prerequisite Knowledge:

  • Knowledge of SQL is helpful
  • Experience with either Python or Scala is required
  • Some familiarity with Apache Spark or other big-data processing frameworks is helpful but not required

Prerequisite Courses:

Software & Hardware Requirements

  • Web Browser: Chrome
  • An Internet Connection
  • GoToTraining (for remote classes only)
    Please see the GoToMeeting System Check
  • A computer, laptop, or tablet with a keyboard

Additional Notes

  • The appropriate, web-based programming environment will be provided to participants
  • Note: This class can be taught concurrently in Python and Scala

Outline

  • About Databricks and Spark
  • A high-level overview of the Spark Architecture
  • Spark Entry Points, Simple Data Ingestion & overview of API docs
  • Review different data ingestion options
  • Introduction to the "core" DataFrames APIs
  • Introduction to Spark's execution model
  • Hands-on exercises to familiarize participants with the Spark UI

Electives (select one):
  • Introduction to Structured Streaming
  • Introduction to the Machine Learning Pipeline
  • Deeper dive into the DataFrames APIs

Upcoming Classes

Date    Time                                     Location                       Price
Apr 21  9:00 AM - 5:00 PM Pacific Daylight Time  Online - Virtual - US Pacific  $1500.00 USD
May 26  9:00 AM - 5:00 PM Pacific Daylight Time  San Francisco, United States   $1500.00 USD
May 26  9:00 AM - 5:00 PM Pacific Daylight Time  Online - Virtual - US Pacific  $1500.00 USD
Jun 9   9:00 AM - 5:00 PM Eastern Daylight Time  Edison, United States          $1500.00 USD
Jun 9   9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $1500.00 USD
Jun 29  9:00 AM - 5:00 PM Eastern Daylight Time  McLean, United States          $1500.00 USD
Jun 29  9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $1500.00 USD
Jul 22  9:00 AM - 5:00 PM Pacific Daylight Time  San Francisco, United States   $1500.00 USD
Jul 22  9:00 AM - 5:00 PM Pacific Daylight Time  Online - Virtual - US Pacific  $1500.00 USD
Jul 27  9:00 AM - 5:00 PM Eastern Daylight Time  McLean, United States          $1500.00 USD
Jul 27  9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $1500.00 USD
Aug 26  9:00 AM - 5:00 PM Eastern Daylight Time  McLean, United States          $1500.00 USD
Aug 26  9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $1500.00 USD
Sep 14  9:00 AM - 5:00 PM Pacific Daylight Time  San Francisco, United States   $1500.00 USD
Sep 14  9:00 AM - 5:00 PM Pacific Daylight Time  Online - Virtual - US Pacific  $1500.00 USD
Sep 30  9:00 AM - 5:00 PM Eastern Daylight Time  McLean, United States          $1500.00 USD
Sep 30  9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $1500.00 USD
Oct 26  9:00 AM - 5:00 PM Eastern Daylight Time  McLean, United States          $1500.00 USD
Oct 26  9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $1500.00 USD
Nov 4   9:00 AM - 5:00 PM Pacific Standard Time  San Francisco, United States   $1500.00 USD
Nov 4   9:00 AM - 5:00 PM Pacific Standard Time  Online - Virtual - US Pacific  $1500.00 USD
Dec 9   9:00 AM - 5:00 PM Eastern Standard Time  McLean, United States          $1500.00 USD
Dec 9   9:00 AM - 5:00 PM Eastern Standard Time  Online - Virtual - US Eastern  $1500.00 USD
Dec 14  9:00 AM - 5:00 PM Eastern Standard Time  McLean, United States          $1500.00 USD
Dec 14  9:00 AM - 5:00 PM Eastern Standard Time  Online - Virtual - US Eastern  $1500.00 USD
