- sagemaker-train-pipeline.ipynb: builds a machine learning pipeline covering data preprocessing, model training with hyperparameter tuning, model evaluation, and conditional model deployment based on performance metrics.
- sagemaker-inference-pipeline.ipynb: creates an inference pipeline and demonstrates SageMaker batch transform.
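The training notebook's final step deploys the model only when its evaluation metric clears a predefined bar. A minimal sketch of that conditional gate is below; the metric name, threshold value, and report layout are illustrative assumptions, not the notebook's exact schema.

```python
import json

# Assumed acceptance threshold; the real pipeline reads this from configuration.
ACCURACY_THRESHOLD = 0.85

def should_deploy(evaluation_report: dict, threshold: float = ACCURACY_THRESHOLD) -> bool:
    """Return True when the model's validation accuracy meets the bar."""
    accuracy = evaluation_report["metrics"]["validation:accuracy"]
    return accuracy >= threshold

# Example: an evaluation report artifact written by the evaluation step
# (a hypothetical evaluation.json, loaded here from an inline string).
report = json.loads('{"metrics": {"validation:accuracy": 0.91}}')
print(should_deploy(report))  # True: 0.91 clears the 0.85 bar
```

In the actual pipeline this boolean drives a condition step, so a model that misses the threshold never reaches the deployment stage.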
- Store machine learning model source code and related files.
- Manage model training and inference images.
- Store raw data for preprocessing.
- Save transformed datasets after preprocessing.
- Use AWS CodeCommit and AWS CodeBuild as pipeline automation components.
- Test machine learning models locally to ensure functionality before deployment.
- Employ Infrastructure as Code (IaC) to deploy necessary resources for machine learning workflows.
- Facilitate automated model training processes using SageMaker.
- Deploy trained models to environments such as development, staging, and production.
- Manage the transition of models between different environments.
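The environment-transition rule above can be sketched as a one-stage-forward promotion: a model moves from development to staging to production and never skips a stage. The stage names and in-memory representation here are illustrative assumptions.

```python
# Ordered environments a model moves through; assumed naming, not an AWS API.
STAGES = ["development", "staging", "production"]

def promote(current_stage: str) -> str:
    """Advance a model one environment; refuse to skip stages or leave production."""
    idx = STAGES.index(current_stage)  # raises ValueError for unknown stages
    if idx == len(STAGES) - 1:
        raise ValueError("model is already in production")
    return STAGES[idx + 1]

print(promote("development"))  # staging
```

Encoding the transition as a function makes the promotion path auditable: every environment change passes through one place, which is where approval checks would hook in.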
- Trigger the pipeline by uploading new or updated datasets or by pushing source-code changes to AWS CodeCommit.
- Implement AWS Lambda functions that initiate ETL by submitting AWS Glue jobs for data preprocessing and feature engineering.
- Monitor the model training process in SageMaker using Amazon CloudWatch Events.
- Incorporate training approval steps and conditional progression based on model training results.
- Deploy the trained model using AWS CloudFormation and create a SageMaker endpoint for model serving.
- Execute automated system tests using AWS Step Functions to evaluate model performance against predefined thresholds.
- Proceed with deployment to production, implementing autoscaling policies and data capture for quality monitoring.
- Complete the pipeline execution upon successful deployment to production.
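The execution flow above (trigger, Glue ETL, SageMaker training, approval gate, CloudFormation deploy, production rollout) can be sketched as a sequence of steps over shared state, with the approval gate halting the run when training results miss the threshold. Each step is a stub standing in for the corresponding AWS service call; step names, the metric value, and URIs are illustrative assumptions.

```python
from typing import Callable, Dict, List

def run_pipeline(steps: List[Callable[[Dict], Dict]], state: Dict) -> Dict:
    """Run steps in order; stop as soon as a conditional gate halts the run."""
    for step in steps:
        state = step(state)
        if state.get("halted"):
            break
    return state

def preprocess(state: Dict) -> Dict:  # AWS Glue job in the real pipeline
    state["features"] = "s3://example-bucket/processed/"  # illustrative URI
    return state

def train(state: Dict) -> Dict:  # SageMaker training job
    state["metric"] = 0.9  # stand-in for the reported evaluation metric
    return state

def approval_gate(state: Dict) -> Dict:  # training-approval / condition step
    if state["metric"] < state["threshold"]:
        state["halted"] = True
    return state

def deploy(state: Dict) -> Dict:  # CloudFormation stack creating the endpoint
    state["endpoint"] = "model-endpoint-staging"  # illustrative endpoint name
    return state

result = run_pipeline([preprocess, train, approval_gate, deploy],
                      {"threshold": 0.85})
print(result["endpoint"])  # model-endpoint-staging
```

Raising the threshold above the reported metric halts the run at the gate, so the deploy step never executes, which mirrors the conditional progression described above.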