Complete AWS SageMaker Model Monitor Tutorial: ML Model Monitoring in Production
Amazon SageMaker Model Monitor automatically detects data quality issues, model quality degradation, bias drift, and feature attribution drift in ML models deployed to production. It helps maintain model performance over time.
Why Model Monitor?
Key Benefits:
- Automated monitoring: Continuous model surveillance
- Drift detection: Data and model quality drift alerts
- Bias detection: Monitor fairness metrics
- Explainability: Feature attribution tracking
- Integration: Native SageMaker integration
Monitor Types:
- Data Quality Monitor
- Model Quality Monitor
- Bias Drift Monitor
- Feature Attribution Drift Monitor
Prerequisites
pip install sagemaker boto3 pandas numpy
# Requires SageMaker SDK >= 2.0; check the installed version:
python -c "import sagemaker; print(sagemaker.__version__)"
Quick Start
1. Setup
import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.model_monitor import (
    DefaultModelMonitor,
    DataCaptureConfig,
    CronExpressionGenerator,
)

session = sagemaker.Session()
bucket = session.default_bucket()
role = get_execution_role()
region = session.boto_region_name

# Monitor output location
monitor_output = f"s3://{bucket}/model-monitor"
2. Deploy Model with Data Capture
from sagemaker.model import Model
from sagemaker.predictor import Predictor

# Create model.
# xgboost_image (container image URI) and model_data_uri (S3 path to the
# model artifact) must already be defined for your own model.
model = Model(
    image_uri=xgboost_image,
    model_data=model_data_uri,
    role=role,
    predictor_cls=Predictor,  # without this, deploy() returns None
)

# Data capture configuration
data_capture_config = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=100,  # Capture all requests
    destination_s3_uri=f"s3://{bucket}/data-capture",
    capture_options=["Input", "Output"],
    csv_content_types=["text/csv"],
    json_content_types=["application/json"],
)

# Deploy with data capture
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="monitored-endpoint",
    data_capture_config=data_capture_config,
)
print(f"Endpoint deployed: {predictor.endpoint_name}")
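Once the endpoint receives traffic, captured requests and responses are written under the data-capture prefix as JSON Lines files. The following is a minimal sketch of decoding one such record locally. It assumes the commonly documented record layout (captureData, endpointInput, endpointOutput, eventMetadata) and a CSV-encoded payload with a single numeric output; the sample record and the helper name parse_capture_record are hypothetical and only for illustration.

```python
import json

# Hypothetical sample line, shaped like a data-capture record for a CSV endpoint.
SAMPLE_RECORD = json.dumps({
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "34,1,0,52000.0",
            "encoding": "CSV",
        },
        "endpointOutput": {
            "observedContentType": "text/csv; charset=utf-8",
            "mode": "OUTPUT",
            "data": "0.87",
            "encoding": "CSV",
        },
    },
    "eventMetadata": {
        "eventId": "example-event-id",
        "inferenceTime": "2024-01-01T00:00:00Z",
    },
    "eventVersion": "0",
})


def parse_capture_record(line):
    """Return (features, prediction, event_id) from one captured JSON line.

    Assumes CSV-encoded input and a single numeric output value.
    """
    record = json.loads(line)
    capture = record["captureData"]
    features = [float(x) for x in capture["endpointInput"]["data"].split(",")]
    prediction = float(capture["endpointOutput"]["data"].strip())
    return features, prediction, record["eventMetadata"]["eventId"]


features, prediction, event_id = parse_capture_record(SAMPLE_RECORD)
print(features, prediction, event_id)
```

In practice you would read each line of a downloaded capture file and pass it to a helper like this one; payloads with other encodings (for example base64) would need an extra decoding step.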
Data Quality Monitor
1. Create Baseline
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

# Create monitor
data_quality_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

# Create baseline from training data
data_quality_monitor.suggest_baseline(
    baseline_dataset=f"s3://{bucket}/training-data/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=f"{monitor_output}/data-quality/baseline",
    wait=True,
)
print("Baseline created!")
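To build intuition for what a baseline summarizes, here is a rough local approximation that computes per-column completeness, mean, and standard deviation from a tiny CSV. This is only an illustrative sketch using the Python standard library: it does not reproduce the exact statistics or algorithms the baselining job uses, and the sample data and the function name summarize_columns are hypothetical.

```python
import csv
import io
import statistics

# Hypothetical training sample; the real baseline is computed from train.csv on S3.
SAMPLE_CSV = """age,income
30,50000
40,
50,70000
"""


def summarize_columns(csv_text):
    """Compute completeness, mean, and population std dev per numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for name in rows[0].keys():
        # Treat empty cells as missing values.
        values = [float(r[name]) for r in rows if r[name] not in ("", None)]
        summary[name] = {
            "completeness": len(values) / len(rows),
            "mean": statistics.mean(values),
            "std_dev": statistics.pstdev(values),
        }
    return summary


summary = summarize_columns(SAMPLE_CSV)
for name, stats in summary.items():
    print(name, stats)
```

Comparing numbers like these against later traffic is the basic idea behind data quality drift detection; the managed job handles this at scale and also infers column types.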
2. View Baseline Statistics
import json

# Get baseline statistics
baseline_job = data_quality_monitor.latest_baselining_job
statistics_path = f"{monitor_output}/data-quality/baseline/statistics.json"
constraints_path = f"{monitor_output}/data-quality/baseline/constraints.json"

# Download and view statistics
s3 = boto3.client("s3")

# Parse S3 URI into (bucket, key)
def parse_s3_uri(uri):
    parts = uri.replace("s3://", "").split("/", 1)
    return parts[0], parts[1]

bucket_name, key = parse_s3_uri(statistics_path)
response = s3.get_object(Bucket=bucket_name, Key=key)
statistics = json.loads(response["Body"].read())

print("Baseline Statistics:")
for feature in statistics["features"]:
    # Non-numeric features have no numerical_statistics entry
    mean = feature.get("numerical_statistics", {}).get("mean", "N/A")
    print(f"  {feature['name']}: mean={mean}")
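The constraints file defined above can be inspected in the same way as the statistics file. The sketch below shows one way to summarize it, together with a slightly stricter URI parser built on urllib.parse. The sample JSON is a hypothetical excerpt that assumes the usual constraints.json shape (a features list with name, inferred_type, completeness, and an optional num_constraints object); the names split_s3_uri and describe_constraints are illustrative, not part of the SageMaker SDK.

```python
import json
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key), rejecting other schemes."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical excerpt shaped like a constraints.json file.
SAMPLE_CONSTRAINTS = json.loads("""
{
  "version": 0.0,
  "features": [
    {"name": "age", "inferred_type": "Integral", "completeness": 1.0,
     "num_constraints": {"is_non_negative": true}},
    {"name": "state", "inferred_type": "String", "completeness": 0.98}
  ]
}
""")


def describe_constraints(constraints):
    """Return one readable line per feature constraint."""
    lines = []
    for feature in constraints["features"]:
        # num_constraints is only present for numeric features.
        non_negative = feature.get("num_constraints", {}).get("is_non_negative")
        lines.append(
            f"{feature['name']}: type={feature['inferred_type']}, "
            f"completeness>={feature['completeness']}, non_negative={non_negative}"
        )
    return lines


for line in describe_constraints(SAMPLE_CONSTRAINTS):
    print(line)
```

To use this against the real baseline, download the object at the constraints path with the S3 client, load it with json.loads, and pass the result to describe_constraints.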