Complete AWS Lambda + SageMaker Tutorial: Serverless ML Inference
Combining AWS Lambda with SageMaker enables serverless, scalable, and cost-effective ML inference. This tutorial covers patterns for integrating Lambda functions with SageMaker endpoints for production ML applications.
Why Lambda + SageMaker?
Key Benefits:
- Serverless: No infrastructure management
- Cost-effective: Pay per invocation
- Scalable: Automatic scaling to demand
- Flexible: Multiple integration patterns
- Event-driven: Trigger inference from any AWS event
Use Cases:
- Real-time API inference
- Batch processing triggers
- Event-driven ML pipelines
- Cost-optimized inference
- Multi-model serving
Prerequisites
pip install boto3 sagemaker
# AWS CLI configured
aws configure
Pattern 1: Lambda Invoking SageMaker Endpoint
1. Deploy SageMaker Endpoint
import sagemaker
from sagemaker.sklearn import SKLearnModel

session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Deploy model
# Note: the scikit-learn serving container usually also needs an
# entry_point inference script; none is specified in this example.
model = SKLearnModel(
    model_data="s3://bucket/model/model.tar.gz",
    role=role,
    framework_version="1.0-1",
    py_version="py3",
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="sklearn-endpoint",
)
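By default, deploy() blocks until the endpoint is up. A separate script or CI job that wires up Lambda may still want to confirm the endpoint is InService first. The following is a minimal sketch, not part of the original tutorial: the helper name wait_for_endpoint and its polling defaults are illustrative, and the client argument is assumed to be boto3.client("sagemaker").

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, max_polls=40):
    """Poll describe_endpoint until the endpoint is InService or has failed.

    `client` is assumed to be boto3.client("sagemaker"); it is passed in
    so the helper can also be exercised with a stub object.
    """
    for _ in range(max_polls):
        response = client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        # Still Creating/Updating: wait before asking again.
        time.sleep(poll_seconds)
    raise TimeoutError(
        f"Endpoint {endpoint_name} not InService after {max_polls} polls"
    )
```

A call would look like wait_for_endpoint(boto3.client("sagemaker"), "sklearn-endpoint"). boto3 also ships a built-in waiter for this, client.get_waiter("endpoint_in_service"), which may be preferable when custom polling logic is not needed.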
2. Lambda Function Code
# lambda_function.py
import json
import os

import boto3

# Initialize SageMaker runtime client once per container, outside the handler
sagemaker_runtime = boto3.client("sagemaker-runtime")
ENDPOINT_NAME = os.environ.get("SAGEMAKER_ENDPOINT", "sklearn-endpoint")


def lambda_handler(event, context):
    """Invoke SageMaker endpoint for inference."""
    try:
        # Parse input: API Gateway wraps the payload in a JSON string under "body"
        if "body" in event:
            body = json.loads(event["body"])
        else:
            body = event

        features = body.get("features", [])

        # Prepare payload
        payload = json.dumps({"features": features})

        # Invoke endpoint
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType="application/json",
            Body=payload,
        )

        # Parse response
        result = json.loads(response["Body"].read().decode())

        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({
                "prediction": result,
                "endpoint": ENDPOINT_NAME,
            }),
        }
    except Exception as e:
        return {
            "statusCode": 500,
            "body": json.dumps({"error": str(e)}),
        }
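The handler in step 2 reports every failure as a 500, including malformed requests from the caller. One possible refinement is to validate the input first and return a 400 for client errors. The sketch below is an assumption-laden illustration, not the tutorial's own code: parse_features and bad_request are hypothetical helper names, and the validation assumes a flat list of numbers (a model expecting nested or batched input would need a different check).

```python
import json


def parse_features(event):
    """Extract a non-empty list of numeric features from a Lambda event.

    Accepts either an API Gateway proxy event (JSON string under "body")
    or a direct invocation payload. Raises ValueError on malformed input.
    """
    body = event
    if "body" in event:
        try:
            body = json.loads(event["body"] or "{}")
        except json.JSONDecodeError as exc:
            raise ValueError("Request body is not valid JSON") from exc
    if not isinstance(body, dict):
        raise ValueError("Request body must be a JSON object")

    features = body.get("features")
    if not isinstance(features, list) or not features:
        raise ValueError('"features" must be a non-empty list')
    # Assumption: a flat list of numbers; bool is excluded explicitly
    # because it is a subclass of int in Python.
    for value in features:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError('"features" must contain only numbers')
    return features


def bad_request(message):
    """Build an API Gateway style 400 response."""
    return {
        "statusCode": 400,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"error": message}),
    }
```

Inside the handler, parse_features(event) could be called in its own try block, returning bad_request(str(exc)) when a ValueError is raised, so that only endpoint or runtime failures fall through to the 500 path.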
3. Deploy Lambda with SAM
# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  InferenceFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: lambda_function.lambda_handler
      Runtime: python3.9
      Timeout: 30
      MemorySize: 256
      Environment:
        Variables:
          SAGEMAKER_ENDPOINT: sklearn-endpoint
      Policies:
        - Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
                - sagemaker:InvokeEndpoint
              # Consider scoping this to the endpoint's ARN in production
              Resource: '*'
      Events:
        Api:
          Type: Api
          Properties:
            Path: /predict
            Method: post
# Deploy
sam build
sam deploy --guided
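Once the stack is deployed, the /predict route can be exercised with a plain HTTP POST. The following is a minimal sketch using only the standard library. The helper names build_predict_request and predict are hypothetical, and api_base_url is a placeholder for the API Gateway invoke URL (including its stage) of your own deployment.

```python
import json
import urllib.request


def build_predict_request(api_base_url, features):
    """Build a POST request for the /predict route defined in template.yaml."""
    payload = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=api_base_url.rstrip("/") + "/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(api_base_url, features, timeout=30):
    """Send the request and return the decoded JSON response body."""
    request = build_predict_request(api_base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so the 500 returned by the handler on failure surfaces as an exception on the client side rather than as a return value.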
Pattern 2: Serverless Inference with Lambda Container
1. Lambda Container with Model
# Dockerfile
FROM public.ecr.aws/lambda/python:3.9
# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt