

1. Get started with Azure Databricks
Azure Databricks is a cloud service that provides a scalable platform for data analysis using Apache Spark.
2. Understanding the Architecture of Azure Databricks
This module describes the hierarchical architecture of Azure Databricks, covering the separation of the control and compute layers, the account hierarchy, and various storage options, including Unity Catalog managed storage.
3. Understanding Azure Databricks Integrations
Learn how Azure Databricks integrates with various Microsoft services, such as Fabric, Power BI, and Copilot Studio, to deliver end-to-end solutions for data engineering, analytics, and AI.
4. Select and configure compute resources in Azure Databricks
Learn how to select and configure compute options in Azure Databricks to optimize them for different workloads, manage performance settings and access permissions, and secure serverless and classic compute resources.
5. Creating and Organizing Objects in Unity Catalog
This module covers how to use Unity Catalog's three-tier namespace (catalogs, schemas, and objects) to organize data resources, create tables and volumes, and set up AI/BI Genie spaces to improve data discoverability.
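To make the three-tier namespace concrete, here is a minimal pure-Python sketch of composing and parsing a fully qualified object name (illustrative only; this is not the Databricks SDK, and the names used are hypothetical):

```python
# Minimal sketch of Unity Catalog's three-tier naming: catalog.schema.object.
# Illustrative only; not Databricks API code.
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectName:
    catalog: str
    schema: str
    name: str

    def qualified(self) -> str:
        # Fully qualified name as referenced in SQL: catalog.schema.object
        return f"{self.catalog}.{self.schema}.{self.name}"

    @classmethod
    def parse(cls, fqn: str) -> "ObjectName":
        parts = fqn.split(".")
        if len(parts) != 3:
            raise ValueError("expected catalog.schema.object")
        return cls(*parts)

# Hypothetical example names:
sales = ObjectName("main", "retail", "orders")
print(sales.qualified())                                 # main.retail.orders
print(ObjectName.parse("main.retail.orders") == sales)   # True
```

In actual SQL on Databricks you would reference such an object directly, e.g. `SELECT * FROM main.retail.orders`.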
6. Securing Unity Catalog Objects
Learn how to secure Unity Catalog objects using centralized governance and security features such as access control, granular permissions, row/column filtering, and data access authentication via service principals.
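Row filtering and column masking restrict what each caller sees of the same table. In Unity Catalog these are defined as SQL functions bound to a table; the pure-Python sketch below (hypothetical helper names, not the actual feature) only illustrates the semantics:

```python
# Illustrative sketch of row filtering and column masking semantics.
# Hypothetical names; real Unity Catalog row filters and column masks
# are SQL UDFs attached to a table, not Python functions.
rows = [
    {"region": "EU", "email": "a@example.com", "amount": 10},
    {"region": "US", "email": "b@example.com", "amount": 20},
]

def row_filter(row, user_regions):
    # Keep only rows the caller is entitled to see.
    return row["region"] in user_regions

def mask_email(row, can_see_pii):
    # Redact the email column for callers without PII access.
    masked = dict(row)
    if not can_see_pii:
        masked["email"] = "***"
    return masked

visible = [mask_email(r, can_see_pii=False)
           for r in rows if row_filter(r, {"EU"})]
print(visible)  # [{'region': 'EU', 'email': '***', 'amount': 10}]
```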
7. Governance of Unity Catalog Objects
This section covers core governance procedures in Unity Catalog, including implementing fine-grained access control, tracking data lineage, configuring audit logging, and sharing data securely, so you can monitor and manage your data assets.
8. Designing and Implementing Data Modeling with Azure Databricks
This module focuses on effective data modeling in Azure Databricks using Unity Catalog and covers designing ingestion logic, selecting tools and formats, implementing partitioning and clustering, and managing slowly changing dimensions.
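To illustrate what "slowly changing dimensions" means in practice, here is a minimal pure-Python sketch of Type 2 handling (close the current version of a record, insert a new one). On Azure Databricks this is typically implemented with MERGE on Delta tables, which the sketch deliberately does not show:

```python
# Minimal SCD Type 2 sketch: when a tracked attribute changes, the
# current dimension row is closed out and a new current row is added.
def apply_scd2(dim, key, new_attrs, as_of):
    for row in dim:
        if row["key"] == key and row["current"]:
            if row["attrs"] == new_attrs:
                return dim  # no change, keep history as-is
            row["current"] = False   # close the old version
            row["end_date"] = as_of
    dim.append({"key": key, "attrs": new_attrs,
                "start_date": as_of, "end_date": None, "current": True})
    return dim

# Hypothetical dimension data:
dim = [{"key": 1, "attrs": {"city": "Berlin"},
        "start_date": "2024-01-01", "end_date": None, "current": True}]
apply_scd2(dim, 1, {"city": "Hamburg"}, "2025-06-01")
print(len(dim))  # 2: one closed historical row, one current row
```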
9. Load data into Unity Catalog
Discover comprehensive data loading techniques in Azure Databricks for loading data into Unity Catalog tables, including managed connectors, custom code, SQL batch loading, streaming ingestion, Auto Loader, and orchestration with Lakeflow Spark Declarative Pipelines.
10. Clean, transform, and load data into Unity Catalog
This module covers core data engineering techniques for cleaning and transforming raw data, including data quality profiling, resolving inconsistent values, filtering, aggregation, and combining and reshaping records, as well as loading transformed data using append, overwrite, and merge strategies.
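The three loading strategies named above can be sketched in pure Python over a small keyed table (illustrative only; on Databricks these correspond to Delta write modes and MERGE INTO):

```python
# Illustrative append / overwrite / merge semantics over a keyed table.
def append(table, new_rows):
    # Add rows without touching existing ones.
    return table + new_rows

def overwrite(table, new_rows):
    # Replace the entire table contents.
    return list(new_rows)

def merge(table, new_rows, key):
    # Upsert: update rows with matching keys, insert the rest.
    by_key = {r[key]: r for r in table}
    for r in new_rows:
        by_key[r[key]] = r
    return list(by_key.values())

# Hypothetical example data:
base = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
incoming = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
print(merge(base, incoming, "id"))
# [{'id': 1, 'v': 'a'}, {'id': 2, 'v': 'B'}, {'id': 3, 'v': 'c'}]
```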
11. Implementing and Managing Data Quality Constraints with Azure Databricks
This session explores strategies for maintaining high data quality in Azure Databricks, with a focus on implementing validation checks, enforcing schemas, managing schema drift, and using pipeline expectations for data integrity.
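Pipeline expectations pair a constraint with an action taken on violating rows (for example, drop the row or fail the pipeline). The following pure-Python sketch mimics the drop/fail semantics only; it is not the actual Lakeflow expectations API:

```python
# Sketch of expectation semantics: each rule is (name, predicate, action).
# "drop" removes violating rows; "fail" aborts processing.
def apply_expectations(rows, rules):
    kept = []
    for row in rows:
        ok = True
        for name, predicate, action in rules:
            if not predicate(row):
                if action == "fail":
                    raise ValueError(f"expectation {name!r} violated: {row}")
                if action == "drop":
                    ok = False
        if ok:
            kept.append(row)
    return kept

# Hypothetical rules and data:
rules = [("non_null_id", lambda r: r["id"] is not None, "drop"),
         ("positive_qty", lambda r: r["qty"] > 0, "drop")]
data = [{"id": 1, "qty": 5}, {"id": None, "qty": 3}, {"id": 2, "qty": -1}]
print(apply_expectations(data, rules))  # [{'id': 1, 'qty': 5}]
```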
12. Designing and Implementing Data Pipelines with Azure Databricks
Learn how to use notebooks and Lakeflow Spark Declarative Pipelines to design and implement robust data pipelines in Azure Databricks, covering topics such as orchestration, error handling, and task logic.
13. Implementing Lakeflow Jobs with Azure Databricks
This module focuses on implementing Lakeflow jobs in Azure Databricks, guiding you through creating jobs, configuring triggers and schedules, setting up alerts, and managing automatic restarts to ensure reliable execution of data pipelines.
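The "automatic restarts" mentioned above amount to bounded retries of a failed task run. A minimal retry-with-limit sketch (illustrative only; real Lakeflow Jobs express this as a retry setting in the job configuration, not Python code):

```python
# Sketch of bounded retry: rerun a task up to max_retries extra times
# before giving up, mirroring a job's automatic-restart setting.
def run_with_retries(task, max_retries):
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            if attempts > max_retries:
                raise

# Hypothetical task that fails twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky, max_retries=3)
print(result)  # ('ok', 3)
```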
14. Implementing Development Lifecycle Processes in Azure Databricks
This module covers the implementation of development lifecycle processes in Azure Databricks using Git repositories for version control and Databricks Asset Bundles for infrastructure-as-code deployments, including branching workflows, testing, and CLI-based deployment.
15. Monitoring, Troubleshooting, and Optimizing Workloads in Azure Databricks
Learn how to monitor, troubleshoot, and optimize data workloads in Azure Databricks to ensure reliability and cost efficiency. You’ll analyze cluster usage, diagnose Spark jobs, optimize performance, and forward logs to Azure Log Analytics.
Requirements:
This course consists of live, instructor-led training in which a trainer supervises the participants. Theory and practice are taught through live demonstrations and hands-on exercises. The video-conferencing software Zoom is used.
Prepare for the "Microsoft Certified: Azure Databricks Data Engineer Associate (beta)" exam with this course.
This course is designed for data engineers who have a basic knowledge of data analysis concepts, a fundamental understanding of cloud storage, and familiarity with the principles of data organization.
The training is conducted in collaboration with an authorized training partner. This partner collects and processes data under its own responsibility. Please review the relevant privacy policy.
