Complete AWS Step Functions Tutorial for ML: Orchestrating ML Workflows
AWS Step Functions provides serverless workflow orchestration for machine learning pipelines. The service lets you coordinate multiple AWS services, handle errors gracefully, and build complex ML workflows with visual monitoring.
Why Step Functions for ML?
Key Benefits:
- Visual workflows: Watch pipeline executions in real time
- Error handling: Built-in retry and error recovery
- Service integration: Native AWS service connectors
- Serverless: No infrastructure to manage
- State management: Workflow state is tracked automatically
Use Cases:
- ML training pipelines
- Data preprocessing workflows
- Model deployment automation
- Batch inference orchestration
- MLOps automation
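The built-in retry and error recovery listed above correspond to the `Retry` and `Catch` fields of a Task state in Amazon States Language. The following is a minimal sketch of adding both to a state from Python. The helper name `with_error_handling`, the fallback state name `NotifyFailure`, the account ID, and the retry timings are all assumptions made for illustration.

```python
import json


def with_error_handling(state, fallback_state, max_attempts=3):
    """Return a copy of a Task state with Retry and Catch fields added."""
    hardened = dict(state)
    # Retry failed task attempts with exponential backoff (5s, 10s, 20s, ...).
    hardened["Retry"] = [
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 5,
            "MaxAttempts": max_attempts,
            "BackoffRate": 2.0,
        }
    ]
    # If retries are exhausted (or any other error occurs), route to a
    # fallback state and keep the error details under $.error.
    hardened["Catch"] = [
        {
            "ErrorEquals": ["States.ALL"],
            "ResultPath": "$.error",
            "Next": fallback_state,
        }
    ]
    return hardened


# Hypothetical Lambda task; the ARN is a placeholder.
preprocess = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:preprocess",
    "Next": "TrainModel",
}
hardened_preprocess = with_error_handling(preprocess, "NotifyFailure")
print(json.dumps(hardened_preprocess, indent=2))
```

The state machine would also need a `NotifyFailure` state defined for the `Catch` target to resolve.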
Prerequisites
pip install boto3 sagemaker
# Configure the AWS CLI (if not already done)
aws configure
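Before building the workflow, it can help to confirm that the packages from the install step are importable. This is a small sketch using only the standard library; the helper name `missing_packages` is made up for this example.

```python
import importlib.util


def missing_packages(names=("boto3", "sagemaker")):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All prerequisite packages are installed.")
```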
Quick Start
1. Basic ML Workflow
{
  "Comment": "Simple ML Training Pipeline",
  "StartAt": "PreprocessData",
  "States": {
    "PreprocessData": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:preprocess",
      "Next": "TrainModel"
    },
    "TrainModel": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
      "Parameters": {
        "TrainingJobName.$": "States.Format('training-{}', $$.Execution.Name)",
        "AlgorithmSpecification": {
          "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
          "TrainingInputMode": "File"
        },
        "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",
        "InputDataConfig": [
          {
            "ChannelName": "train",
            "DataSource": {
              "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri.$": "$.traindatauri"
              }
            }
          }
        ],
        "OutputDataConfig": {
          "S3OutputPath": "s3://bucket/output"
        },
        "ResourceConfig": {
          "InstanceCount": 1,
          "InstanceType": "ml.m5.xlarge",
          "VolumeSizeInGB": 30
        },
        "StoppingCondition": {
          "MaxRuntimeInSeconds": 3600
        }
      },
      "Next": "EvaluateModel"
    },
    "EvaluateModel": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:evaluate",
      "End": true
    }
  }
}
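A definition like the one above can be registered and started from Python with boto3. The sketch below first runs a light structural check (every `Next` must name a defined state) and then calls `create_state_machine` and `start_execution`. The helper names, the state machine name, and the role ARN are assumptions; the client is passed in as a parameter so the helpers can be exercised without AWS access.

```python
import json


def validate_definition(definition):
    """Return a list of structural problems found in a state machine definition."""
    problems = []
    states = definition.get("States", {})
    if definition.get("StartAt") not in states:
        problems.append("StartAt does not name a defined state")
    for name, state in states.items():
        target = state.get("Next")
        if target is not None and target not in states:
            problems.append(f"{name}: Next points to unknown state {target!r}")
        # Choice, Succeed, and Fail states do not use Next/End at the top level.
        terminal_or_branching = state.get("Type") in ("Choice", "Succeed", "Fail")
        if target is None and not state.get("End") and not terminal_or_branching:
            problems.append(f"{name}: needs either Next or End")
    return problems


def deploy_and_run(sfn_client, name, role_arn, definition, execution_input):
    """Create the state machine, start one execution, and return its ARN."""
    problems = validate_definition(definition)
    if problems:
        raise ValueError("; ".join(problems))
    created = sfn_client.create_state_machine(
        name=name,
        definition=json.dumps(definition),
        roleArn=role_arn,
    )
    started = sfn_client.start_execution(
        stateMachineArn=created["stateMachineArn"],
        input=json.dumps(execution_input),
    )
    return started["executionArn"]


# A deliberately broken definition to show what the check reports.
broken = {"StartAt": "A", "States": {"A": {"Type": "Task", "Next": "Missing"}}}
print(validate_definition(broken))
```

With real credentials, `sfn_client` would be `boto3.client("stepfunctions")`, and `execution_input` would carry the `traindatauri` key that the `TrainModel` state reads.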
2. Deploy with CloudFormation
AWSTemplateFormatVersion: '2010-09-09'
# StepFunctionsRole and PreprocessFunction are referenced below and must be
# defined elsewhere in this template.
Resources:
  MLPipelineStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      StateMachineName: ml-training-pipeline
      RoleArn: !GetAtt StepFunctionsRole.Arn
      DefinitionString: !Sub |
        {
          "StartAt": "PreprocessData",
          "States": {
            "PreprocessData": {
              "Type": "Task",
              "Resource": "${PreprocessFunction.Arn}",
              "Next": "TrainModel"
            },
            "TrainModel": {
              "Type": "Task",
              "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
              "Parameters": {
                "TrainingJobName.$": "States.Format('job-{}', $$.Execution.Name)"
              },
              "End": true
            }
          }
        }
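One way to deploy a template like this from Python is boto3's CloudFormation client. The sketch below creates the stack and waits for completion. The helper name `deploy_stack` and the stack name are hypothetical, and `CAPABILITY_IAM` is included on the assumption that the template also defines the IAM role it references.

```python
def deploy_stack(cfn_client, stack_name, template_body):
    """Create a CloudFormation stack and block until creation finishes."""
    response = cfn_client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        # Needed when the template creates IAM resources such as a role.
        Capabilities=["CAPABILITY_IAM"],
    )
    # The waiter polls the stack status and raises if creation fails.
    waiter = cfn_client.get_waiter("stack_create_complete")
    waiter.wait(StackName=stack_name)
    return response["StackId"]
```

With real credentials this could be called as `deploy_stack(boto3.client("cloudformation"), "ml-pipeline-stack", template_text)`, where `template_text` is the template file read into a string.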
SageMaker Integration
1. Training Job
{
  "TrainModel": {
    "Type": "Task",
    "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
    "Parameters": {
      "TrainingJobName.$": "States.Format('training-{}', $$.Execution.Name)",
      "AlgorithmSpecification": {
        "TrainingImage": "683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.5-1",
        "TrainingInputMode": "File"
      },
      "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",
      "InputDataConfig": [
        {