DB 105 - Apache Spark™ Programming

Summary

This course covers the fundamentals of Apache Spark, including Spark’s architecture and internals, the core APIs for using Spark, SQL and other high-level data access tools, and Spark’s streaming capabilities and machine learning APIs. The class is a mixture of lecture and hands-on labs.

Duration

3 Days

Objectives

After taking this class, students will be able to:

  • Use the core Spark APIs to operate on data
  • Articulate and implement typical use cases for Spark
  • Build data pipelines and query large data sets using Spark SQL and DataFrames
  • Analyze Spark jobs using the administration UIs inside Databricks
  • Create Structured Streaming jobs
  • Work with graph-structured data using the GraphFrames APIs
  • Understand how a Machine Learning pipeline works
  • Understand the basics of Spark’s internals

Audience

Data analysts who want to learn the fundamentals of programming with Apache Spark, streamline their big data processing, build production Spark jobs, and understand and debug running Spark applications.

Prerequisites

  • Some familiarity with Apache Spark is helpful but not required.
  • Knowledge of SQL is helpful.
  • Basic programming experience in an object-oriented or functional language is required. The class can be taught concurrently in Python and Scala.

Additional Notes

All participants will need:

  • an internet connection
  • a device compatible with one of the supported internet browsers
  • to confirm that their device can run GoToTraining: Validate

NOTE: GoToTraining is our chosen online platform for delivering the class. Prior to attendance, each registrant will receive GoToTraining log-in instructions.

Upcoming Classes

Date            Time                                     Location                       Price
Jan 7 - 9       9:00 AM - 5:00 PM Pacific Standard Time  San Francisco, United States   $2500.00 USD
Jan 7 - 9       9:00 AM - 5:00 PM Pacific Standard Time  Online - Virtual - US Pacific  $2500.00 USD
Feb 11 - 13     9:00 AM - 5:00 PM Eastern Standard Time  McLean, United States          $2500.00 USD
Feb 11 - 13     9:00 AM - 5:00 PM Eastern Standard Time  Online - Virtual - US Eastern  $2500.00 USD
Mar 17 - 19     9:00 AM - 5:00 PM Pacific Daylight Time  San Francisco, United States   $2500.00 USD
Mar 17 - 19     9:00 AM - 5:00 PM Pacific Daylight Time  Online - Virtual - US Pacific  $2500.00 USD
Apr 21 - 23     9:00 AM - 5:00 PM Eastern Daylight Time  Edison, United States          $2500.00 USD
Apr 21 - 23     9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $2500.00 USD
May 26 - 28     9:00 AM - 5:00 PM Pacific Daylight Time  San Francisco, United States   $2500.00 USD
May 26 - 28     9:00 AM - 5:00 PM Pacific Daylight Time  Online - Virtual - US Pacific  $2500.00 USD
Jun 29 - Jul 1  9:00 AM - 5:00 PM Eastern Daylight Time  McLean, United States          $2500.00 USD
Jun 29 - Jul 1  9:00 AM - 5:00 PM Eastern Daylight Time  Online - Virtual - US Eastern  $2500.00 USD
