Introduction to Natural Language Processing

Summary

An introduction to natural language processing with Databricks.

Description

This course introduces natural language processing with Databricks. You will learn how to generate term frequency-inverse document frequency (TF-IDF) vectors for your datasets and how to perform latent semantic analysis using the Databricks Machine Learning Runtime.

Learning objectives

  • Describe the foundational concepts behind using latent semantic analysis to analyze text data.

  • Perform latent semantic analysis using the Databricks Machine Learning Runtime with the Databricks Workspace.

  • Generate TF-IDF vectors in a Databricks Workspace to reduce noise in a dataset used for latent semantic analysis.
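
To make the noise-reduction objective concrete, the following is a minimal plain-Python sketch of TF-IDF weighting. It is not the Databricks Machine Learning Runtime API and is not taken from the course materials. The function name tfidf_vectors, the sample sentences, the whitespace tokenizer, and the unsmoothed log(N / df) weighting are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors), one TF-IDF vector per document.

    Hypothetical helper for illustration only. Uses a naive whitespace
    tokenizer and the unsmoothed variant idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Made-up sample corpus (an assumption, not course data).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
vocab, vectors = tfidf_vectors(docs)
```

In this sketch a term that appears in every document, such as "the", receives a weight of zero, while a term unique to one document, such as "mat", keeps a positive weight. That down-weighting of ubiquitous terms is the sense in which TF-IDF reduces noise before latent semantic analysis is applied.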

Prerequisites

  • Intermediate experience performing machine learning/data science workflows

  • Intermediate experience using the Databricks Data Science Workspace to perform machine learning workflows

Learning path

  • This course is part of the Data Scientist learning path.

Proof of completion

  • After completing 80% of this course, you will receive a proof of completion.

 
