Fundamentals of Structured Streaming

Summary

A high-level introduction to Structured Streaming with Apache Spark™.

Description

A common struggle for organizations is how to accurately ingest and perform calculations on real-time data. This data is also referred to as streaming data, and the challenge of working with it lies in its real-time nature: because it arrives constantly, mechanisms must be put in place to process it and write it to a data store. In this course, you’ll learn about Structured Streaming, an Apache Spark API that helps data practitioners overcome the challenges of working with streaming data. We’ll cover fundamental concepts of batch and streaming data to set the stage for our discussion of Structured Streaming. Then, we’ll discuss where Structured Streaming fits into an organization’s big data ecosystem. Finally, we’ll review real-world business use cases for Structured Streaming.

Learning objectives

  • Explain the benefits of Structured Streaming for working with streaming data.

  • Identify where Structured Streaming fits into an organization’s big data ecosystem.

  • Articulate examples of real-world business use cases for Structured Streaming.

Prerequisites

  • Beginning knowledge about the Databricks Unified Data Analytics Platform (what it is, what it is used for)

  • Beginning knowledge about concepts related to the big data landscape (for example: structured streaming, batch processing, data pipelines)

  • Note: We recommend taking the following two Databricks Academy courses to help you prepare for this course: Fundamentals of Big Data and Fundamentals of Unified Data Analytics with Databricks.

Learning path

  • This course is part of the Business Leader learning path.

Proof of completion

  • After completing 80% of this course, you will receive proof of completion.