CloudPath Academy

Your guide to AWS certification success


AWS Certified Machine Learning - Specialty (MLS-C01) Domain 2

Exploratory Data Analysis

Official Exam Guide: Domain 2: Exploratory Data Analysis

Skill Builder: AWS Certified Machine Learning - Specialty Exam Prep


Domain Overview

Domain 2 accounts for 24% of the exam and covers three tasks: sanitizing and preparing data, performing feature engineering, and analyzing and visualizing data for ML.


Task 2.1: Sanitize and prepare data for modeling

Key Concepts:

Essential Documentation:


Task 2.2: Perform feature engineering

Key Concepts:

Essential Documentation:


Task 2.3: Analyze and visualize data for ML

Key Concepts:

Essential Documentation:


AWS Service FAQs


Study Tips

  1. Master data cleaning - Handle missing values (imputation, deletion), detect and treat outliers, remove duplicates, address class imbalance (SMOTE oversampling, undersampling).

  2. Learn feature engineering - Numeric: normalization, standardization, binning. Text: tokenization, TF-IDF, word embeddings. Categorical: one-hot encoding, label encoding.

  3. Understand dimensionality reduction - PCA for linear reduction, t-SNE for visualization, feature selection methods (filter, wrapper, embedded).

  4. Practice data visualization - Histograms for distributions, scatter plots for correlations, box plots for outliers, heatmaps for correlation matrices.

  5. Study descriptive statistics - Mean, median, mode, standard deviation, correlation coefficients, p-values for hypothesis testing.
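
Several of the techniques named in the tips above can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library, not a recommended production approach. The toy `ages` and `colors` data and every function name are hypothetical, chosen only for this sketch. In practice, libraries such as pandas and scikit-learn provide equivalent operations.

```python
import math
import statistics

# Hypothetical toy data: one missing age (None) and one extreme value (95.0).
ages = [22.0, 25.0, None, 31.0, 28.0, 95.0]
colors = ["red", "blue", "red", "green", "blue", "red"]


def impute_median(values):
    """Replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR], the box-plot rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def standardize(values):
    """Z-score: subtract the mean, divide by the population std deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values):
    """Normalization: rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return the sorted category list and one 0/1 row per label."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


imputed = impute_median(ages)        # data cleaning: fill the missing value
flags = iqr_outliers(imputed)        # data cleaning: mark outliers
z_scores = standardize(imputed)      # feature engineering: standardization
scaled = min_max(imputed)            # feature engineering: normalization
categories, encoded = one_hot(colors)  # feature engineering: one-hot encoding

print(imputed)
print(flags)
print(categories, encoded[0])
```

Median imputation is used here rather than the mean because the median is less sensitive to the extreme value in the sample. The outliers are flagged rather than removed, since whether to drop, cap, or keep them depends on the dataset.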


Note: This is Domain 2 of 4, representing 24% of exam content.