Complete AWS SageMaker Tutorial: Machine Learning in the Cloud

By Ruby Abdullah · tutorial
AWS · SageMaker · MLOps · Cloud ML · Python · Machine Learning

Complete AWS SageMaker Tutorial: End-to-End ML Pipeline

Amazon SageMaker is a fully managed machine learning service that enables data scientists and developers to build, train, and deploy ML models at scale. This tutorial covers the complete ML lifecycle on AWS.

Why AWS SageMaker?

SageMaker Advantages:
  • Fully managed: No infrastructure to manage
  • End-to-end: Complete ML lifecycle support
  • Scalable: Train at any scale on managed infrastructure
  • Integrated: Native AWS service integration
  • Cost-effective: Pay only for what you use

Key Components:
  • SageMaker Studio (IDE)
  • SageMaker Training
  • SageMaker Inference
  • SageMaker Pipelines
  • SageMaker Feature Store
  • SageMaker Model Monitor

Prerequisites

# Install the AWS CLI and SDKs (s3fs lets pandas write to s3:// paths)
pip install awscli boto3 sagemaker pandas scikit-learn s3fs

# Configure AWS credentials
aws configure
# Enter: AWS Access Key ID, Secret Access Key, Region (e.g., us-east-1)
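
As a quick sanity check after `aws configure`, the sketch below parses the INI-style text that the CLI writes to `~/.aws/credentials` and reports which required keys are missing for a profile. The helper name `missing_credential_keys` and the sample contents are illustrative assumptions, not part of the AWS tooling.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(credentials_text, profile="default"):
    """Return the required keys that are absent for `profile` in INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        # Unknown profile: every required key is missing
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(profile, key)]


# Hypothetical file contents; the real file lives at ~/.aws/credentials
sample = """
[default]
aws_access_key_id = AKIAEXAMPLE
"""
print(missing_credential_keys(sample))
```

To check a real setup, read the file's text and pass it in; an empty list means both keys are present for that profile.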

Quick Start

1. Set Up the SageMaker Session

import boto3
import sagemaker
from sagemaker import get_execution_role

# Create session
session = sagemaker.Session()
bucket = session.default_bucket()
role = get_execution_role()  # Or specify IAM role ARN

print(f"Bucket: {bucket}")
print(f"Role: {role}")
print(f"Region: {session.boto_region_name}")
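
Outside a SageMaker notebook there may be no execution role to look up, which is when an explicit IAM role ARN is needed. The following is a minimal sketch of that fallback, assuming a hypothetical helper `resolve_role` that takes the lookup function as an argument; the account ID and role name are made up for illustration.

```python
import re

# Shape of an IAM role ARN: arn:<partition>:iam::<12-digit account id>:role/<path/name>
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def resolve_role(lookup, fallback_arn):
    """Try `lookup()` first; fall back to an explicit ARN if it fails."""
    try:
        role = lookup()
    except Exception:
        # e.g. no notebook execution role is available on a local machine
        role = fallback_arn
    if not ROLE_ARN_PATTERN.match(role or ""):
        raise ValueError(f"Not a valid IAM role ARN: {role!r}")
    return role


def failing_lookup():
    # Stand-in for get_execution_role() failing outside SageMaker
    raise RuntimeError("no execution role found")


# Hypothetical account id and role name, for illustration only
print(resolve_role(failing_lookup, "arn:aws:iam::123456789012:role/MySageMakerRole"))
```

In a notebook, `get_execution_role` itself could be passed as `lookup`; the format check only catches obvious typos and does not confirm that the role exists or has the right permissions.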

2. Prepare Training Data

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load sample data
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target

# Split data
traindf, testdf = train_test_split(df, test_size=0.2, random_state=42)

# Save to S3 (pandas needs the s3fs package for s3:// paths).
# The built-in XGBoost algorithm expects CSV with the label in the
# first column and no header row.
columns = ['target'] + list(iris.feature_names)
trainpath = f"s3://{bucket}/iris/train/train.csv"
testpath = f"s3://{bucket}/iris/test/test.csv"
traindf[columns].to_csv(trainpath, index=False, header=False)
testdf[columns].to_csv(testpath, index=False, header=False)

print(f"Training data: {trainpath}")
print(f"Test data: {testpath}")
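
Before uploading, a local check can catch layout problems early. The built-in XGBoost algorithm reads CSV input with the label in the first column and without a header row, so the sketch below flags rows that break either rule. The function name `check_xgboost_csv` and the sample rows are assumptions made for illustration.

```python
import csv
import io


def check_xgboost_csv(csv_text, num_class):
    """Return a list of problems found in CSV text meant for built-in XGBoost."""
    problems = []
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    for line_no, row in enumerate(rows, start=1):
        try:
            label = float(row[0])
        except (ValueError, IndexError):
            # A non-numeric first cell usually means a header row or a blank line
            problems.append(f"line {line_no}: first column is not numeric")
            continue
        if not label.is_integer() or not 0 <= label < num_class:
            problems.append(f"line {line_no}: label {row[0]} outside 0..{num_class - 1}")
    return problems


# Label first, no header: passes
good = "0,5.1,3.5,1.4,0.2\n2,6.3,3.3,6.0,2.5\n"
# Header row present and label in the last column: both lines are flagged
bad = "sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),target\n5.1,3.5,1.4,0.2,0\n"

print(check_xgboost_csv(good, num_class=3))
print(check_xgboost_csv(bad, num_class=3))
```

The check is deliberately shallow: it inspects only the first column and does not verify that every row has the same number of features.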

Built-in Algorithms

1. XGBoost Training

from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

# Get XGBoost container
container = sagemaker.image_uris.retrieve(
    framework="xgboost",
    region=session.boto_region_name,
    version="1.5-1"
)

# Create estimator
xgbestimator = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path=f"s3://{bucket}/iris/output",
    sagemaker_session=session,
    hyperparameters={
        "objective": "multi:softmax",
        "num_class": 3,
        "num_round": 100,
        "max_depth": 5,
        "eta": 0.2
    }
)

# Define training input
traininput = TrainingInput(
    s3_data=trainpath,
    content_type="csv"
)

# Train model
xgbestimator.fit({"train": traininput})
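
With `objective` set to `multi:softmax`, XGBoost returns the predicted class index rather than per-class probabilities (`multi:softprob` returns the latter). A minimal sketch of that final step, using hypothetical raw scores for the three iris classes; it illustrates the idea only and is not XGBoost's internal code.

```python
import math


def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores):
    """Return the index of the most probable class."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical raw scores for classes 0, 1 and 2
scores = [0.3, 2.1, -1.0]
print(predict_class(scores))  # prints 1
```

Switching the objective to `multi:softprob` corresponds to keeping the output of `softmax` instead of taking the index of its largest entry.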

2. Linear Learner

from sagemaker import LinearLearner

# Create Linear Learner estimator
linear = LinearLearner(
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    predictor_type="multiclass_classifier",
    num_classes=3,
    output_path=f"s3://{bucket}/linear/output"
)

# Prepare data in RecordIO format
trainrecords = linear.record_set(
    traindf.drop('target', axis=1).values.astype('float32'),
    traindf['target'].values.astype('float32'),
    channel='train'
)

# Train
linear.fit(trainrecords)

Custom Training Scripts

1. Scikit-learn Training

# train_sklearn.py
import argparse
import joblib
import os

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
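
A custom training script usually starts by reading its hyperparameters and its data and model locations. The sketch below shows one common way to do that with `argparse`, assuming the `SM_MODEL_DIR` and `SM_CHANNEL_TRAIN` environment variables that SageMaker training containers set. The argument names and local default paths are hypothetical, and this is not the author's script.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters and data/model paths for a training script."""
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line arguments (names here are hypothetical)
    parser.add_argument("--n-estimators", type=int, default=100)
    parser.add_argument("--max-depth", type=int, default=5)
    # SageMaker exposes these locations through environment variables;
    # the local defaults are assumptions so the script can also run on a laptop
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAIN", "./data/train"))
    return parser.parse_args(argv)


args = parse_args(["--n-estimators", "50"])
print(args.n_estimators, args.max_depth)
```

Reading the paths from the environment with local fallbacks keeps the same script usable both inside a training job and for quick local runs.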
