DB 105 - Apache Spark™ Programming
After taking this class, students will be able to:
- Use the core Spark APIs to operate on data
- Articulate and implement typical use cases for Spark
- Build data pipelines and query large data sets using Spark SQL and DataFrames
- Analyze Spark jobs using the administration UIs inside Databricks
- Create Structured Streaming jobs
- Work with graph data (entities and the relationships between them) using the GraphFrames APIs
- Understand how a Machine Learning pipeline works
- Understand the basics of Spark’s internals
This class is intended for data analysts who want to learn the fundamentals of programming with Apache Spark, streamline their big data processing, build production Spark jobs, and understand and debug running Spark applications.
- Some familiarity with Apache Spark is helpful but not required.
- Knowledge of SQL is helpful.
- Basic programming experience in an object-oriented or functional language is required. The class can be taught concurrently in Python and Scala.
All participants will need:
Before Jul 09, 2019 9:00AM BST