This 2-day course teaches best practices for building data pipelines with Databricks through lectures and hands-on labs. By the end of the course, you will have the knowledge and skills a data engineer needs to build an end-to-end Delta Lake pipeline for streaming and batch data, from raw data ingestion to consumption by end users.
This course uses a case-study-driven approach to explore the fundamentals of Spark programming with Databricks, including Spark architecture, the DataFrame API, Structured Streaming, and query optimization.
In this course, data analysts and data scientists practice the full data science workflow: exploring data, creating features, building models, tuning hyperparameters, and tracking parameters and managing models with MLflow. By the end of this course, you will have built end-to-end machine learning models ready for production.
In this 1-day course, data scientists and data engineers learn best practices for managing experiments, projects, and models using MLflow. Students build a pipeline to log and deploy machine learning models, and explore common production issues that arise when deploying machine learning solutions and monitoring models once they are in production.
This 3-day course provides a thorough review of the Apache Spark framework, including the "Spark fundamentals," through lecture and hands-on labs, with specific emphasis on skills development and the unique needs of a Data Engineering team.
This course is to be replaced by Scalable Machine Learning with Apache Spark.
This 3-day course provides an introduction to the "Spark fundamentals" and the "ML fundamentals," along with a cursory look at various Machine Learning and Data Science topics, through lecture and hands-on labs, with specific emphasis on skills development and the unique needs of a Data Science team.