Design and implement production-ready Lakehouse architectures using Delta Lake and Databricks. By the end of this course, you will be able to build multi-layer Medallion pipelines spanning Bronze, Silver, and Gold layers; manage ACID transactions; enforce and evolve schemas; implement Change Data Capture; and optimize Delta tables for performance using data skipping, compaction, and Liquid Clustering. You will also learn to unify batch and streaming workloads while ensuring reliability, scalability, and recoverability in enterprise environments.

Lakehouse Architecture and Delta Lake with Databricks

Recommended experience
Basic SQL and Python knowledge is recommended; no prior Databricks or Delta Lake experience is required.
What you'll learn
Design and implement Lakehouse architectures using Databricks and Delta Lake to replace legacy data platforms
Build end-to-end data pipelines using Medallion Architecture (Bronze, Silver, Gold) with incremental processing and Change Data Capture
Apply Delta Lake performance optimization techniques—including data skipping, file compaction, and Liquid Clustering—to support BI and ML workloads
Manage production-grade data reliability through ACID transactions, time travel, schema enforcement, and concurrency control
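The reliability features listed above map directly onto Delta Lake's SQL surface. As a brief sketch (the table name `sales` is an illustrative placeholder, not from the course materials):

```sql
-- Inspect the transaction log: every write is an ACID-committed table version
DESCRIBE HISTORY sales;

-- Time travel: query the table as it existed at an earlier version
SELECT * FROM sales VERSION AS OF 3;

-- Recover from a bad write by rolling back to a known-good version
RESTORE TABLE sales TO VERSION AS OF 3;
```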
Details to know

February 2026

There are 4 modules in this course
This module introduces the evolution of modern data platforms, from traditional warehouses and data lakes to the unified Lakehouse architecture. Learners explore foundational concepts of Databricks, Apache Spark, and Delta Lake that enable scalable, reliable, and governed data processing.
What's included
15 videos, 5 readings, 4 assignments
This module focuses on the core operational capabilities of Delta Lake, including storage architecture, metadata management, transactional processing, and schema control. Learners gain hands-on experience with CRUD operations, incremental data pipelines, time travel, and streaming to build reliable, production-ready data workflows.
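Incremental upserts of the kind this module covers are typically expressed with Delta Lake's `MERGE INTO`, which applies a batch of changes in a single ACID transaction. A minimal sketch (the `customers` and `customer_updates` tables are illustrative placeholders):

```sql
-- Upsert a batch of changes into a Delta table atomically
MERGE INTO customers AS t
USING customer_updates AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *   -- apply changed rows
WHEN NOT MATCHED THEN INSERT *;  -- add new rows
```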
What's included
12 videos, 4 readings, 4 assignments
This module focuses on designing scalable Lakehouse architectures using Medallion patterns and optimizing Delta Lake for performance and cost efficiency. Learners build multi-layer data pipelines and apply advanced optimization techniques to support BI and machine learning workloads.
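Two of the optimization techniques this module covers, Liquid Clustering and file compaction, are declared in Delta SQL. A hedged sketch (the `events` table and its columns are illustrative, not from the course):

```sql
-- Liquid Clustering: declare clustering keys instead of rigid partitions
CREATE TABLE events (
  event_id   BIGINT,
  event_date DATE,
  user_id    BIGINT
) USING DELTA
CLUSTER BY (event_date);

-- Compact small files and apply the clustering layout
OPTIMIZE events;
```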
What's included
13 videos, 5 readings, 4 assignments
This module focuses on unifying batch and streaming workloads on a single Delta Lake foundation. Learners apply operational practices that keep enterprise data pipelines reliable, scalable, and recoverable.
What's included
9 videos, 3 readings, 4 assignments

Frequently asked questions
Delta Lake is an open-source storage layer that brings ACID transactions, schema enforcement, and time travel to data lakes. This course teaches you to use Delta Lake natively on Databricks to build reliable, scalable data pipelines.
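The guarantees named in this answer show up directly in Delta SQL. A short sketch of schema enforcement and evolution (the `customers` table and `country` column are illustrative placeholders):

```sql
-- Schema enforcement: Delta rejects writes whose schema does not match the table
INSERT INTO customers (customer_id, name) VALUES (1, 'Ada');

-- Schema evolution: add a column explicitly before writing to it
ALTER TABLE customers ADD COLUMNS (country STRING);
```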
No. The course starts from the Databricks environment and workspace setup. Basic SQL and Python knowledge is recommended, but no prior Databricks or Delta Lake experience is required.
A Lakehouse combines the low-cost storage of a data lake with the reliability and performance of a data warehouse. You'll learn how Databricks implements this unified architecture using Delta Lake as its core.
Financial aid available.
¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.


