CloudPath Academy

Your guide to AWS certification success


AWS Certified Machine Learning - Specialty (MLS-C01) Domain 1

Data Engineering

Official Exam Guide: Domain 1: Data Engineering

Skill Builder: AWS Certified Machine Learning - Specialty Exam Prep


Domain Overview

Domain 1 (20%) focuses on creating data repositories for ML, implementing data ingestion solutions, and implementing data transformation solutions.


Task 1.1: Create data repositories for ML

Key Concepts:

Essential Documentation:


Task 1.2: Identify and implement a data ingestion solution

Key Concepts:

Essential Documentation:


Task 1.3: Identify and implement a data transformation solution

Key Concepts:

Essential Documentation:


AWS Service FAQs


Study Tips

  1. Master data storage options - S3 for scalable object storage (training data, model artifacts), EFS for shared file systems, EBS for block storage attached to EC2 instances, databases for structured data.

  2. Learn streaming vs batch - Kinesis Data Streams for real-time ingestion, Firehose for buffered, near-real-time delivery to S3/Redshift, Glue for batch ETL, EMR for large-scale processing.

  3. Understand ETL pipelines - AWS Glue for serverless ETL, EMR with Spark for complex transformations, Glue Data Catalog for metadata management.

  4. Practice data lake architecture - S3 as data lake storage, Glue crawlers for schema discovery, Athena for SQL queries, Lake Formation for governance.

  5. Study Apache Spark - DataFrame API, transformations vs actions, lazy evaluation, RDD operations, Spark SQL for ML data preparation.
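
The "transformations vs actions" and "lazy evaluation" ideas in tip 5 can be easier to remember with a small model of the behavior. The sketch below is a plain-Python analogy, not PySpark: `LazyDataset` and its `from_list` helper are hypothetical names invented for illustration, and only the method names `map`, `filter`, `collect`, and `count` are borrowed from Spark. It assumes a single in-memory list as the data source and shows that chaining transformations does no work until an action runs.

```python
from typing import Any, Callable, Iterator, List


class LazyDataset:
    """Hypothetical stand-in for a Spark RDD; this is NOT the PySpark API."""

    def __init__(self, make_iter: Callable[[], Iterator[Any]], log: List[str]):
        self._make_iter = make_iter  # builds a fresh iterator on demand
        self._log = log              # records when work actually happens

    @classmethod
    def from_list(cls, items: List[Any], log: List[str]) -> "LazyDataset":
        return cls(lambda: iter(items), log)

    # Transformations: return a new LazyDataset and process nothing yet.
    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        def make() -> Iterator[Any]:
            for item in self._make_iter():
                self._log.append(f"map:{item}")
                yield fn(item)
        return LazyDataset(make, self._log)

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        def make() -> Iterator[Any]:
            for item in self._make_iter():
                self._log.append(f"filter:{item}")
                if pred(item):
                    yield item
        return LazyDataset(make, self._log)

    # Actions: force evaluation of the whole chain and return a result.
    def collect(self) -> List[Any]:
        return list(self._make_iter())

    def count(self) -> int:
        return sum(1 for _ in self._make_iter())


log: List[str] = []
ds = LazyDataset.from_list([1, 2, 3, 4], log)

# Two chained transformations: only a plan is built here.
pipeline = ds.filter(lambda x: x % 2 == 0).map(lambda x: x * 10)
work_before_action = len(log)  # 0, nothing has been processed yet

# The action triggers the work.
result = pipeline.collect()    # [20, 40]
```

After `collect()`, `log` shows each element flowing through `filter` and then `map` one at a time, which loosely mirrors how Spark pipelines narrow transformations within a stage. Real Spark additionally builds a DAG, optimizes it, and distributes the work across executors, none of which this sketch attempts.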


Note: This is Domain 1 of 4, representing 20% of exam content.