Complete Terraform Tutorial for ML: Infrastructure as Code for Machine Learning
Terraform is an Infrastructure as Code (IaC) tool that lets you provision and manage cloud infrastructure declaratively. For ML teams, Terraform helps automate the deployment of training clusters, inference endpoints, and data pipelines.
Why Terraform for ML?
Terraform's advantages:
- Reproducibility: the same configuration produces the same infrastructure on every run
- Version control: infrastructure changes are tracked like code
- Multi-cloud: supports AWS, GCP, and Azure
- Modularity: reusable infrastructure components
- State management: tracks the state of every managed resource
Use cases:
- ML training infrastructure
- Inference endpoint deployment
- Data pipeline provisioning
- Development environments
- Multi-region deployments
Installation
# macOS
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
# Linux (Debian/Ubuntu)
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
# Verify installation
terraform --version
Quick Start
1. Project Structure
ml-infrastructure/
├── main.tf # Main configuration
├── variables.tf # Input variables
├── outputs.tf # Output values
├── providers.tf # Provider configuration
├── terraform.tfvars # Variable values
└── modules/
    ├── training/
    ├── inference/
    └── storage/
2. Basic Configuration
# providers.tf
terraform {
  # Pin the AWS provider to the 5.x series
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  required_version = ">= 1.0"
}
provider "aws" {
  region = var.aws_region
}
# variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-west-2"
}
variable "environment" {
  description = "Environment name"
  type        = string
  default     = "dev"
}
variable "project_name" {
  description = "Project name"
  type        = string
  default     = "ml-platform"
}
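The project structure lists terraform.tfvars as the place for variable values. A minimal sketch of what that file might contain, assuming the variables in variables.tf are named aws_region, environment, and project_name; the values are illustrative and not taken from a real project:

```hcl
# terraform.tfvars
# Illustrative values only; the names must match the declarations in variables.tf.
aws_region   = "us-west-2"
environment  = "staging"     # overrides the "dev" default
project_name = "ml-platform"
```

Terraform loads a file named terraform.tfvars automatically, so these values override the defaults without any extra command-line flags.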
3. Basic Commands
# Initialize Terraform
terraform init
# Format configuration
terraform fmt
# Validate configuration
terraform validate
# Plan changes
terraform plan
# Apply changes
terraform apply
# Destroy infrastructure
terraform destroy
S3 for ML Data
1. Data Lake Setup
# modules/storage/main.tf
# Bucket that holds raw and processed ML data
resource "aws_s3_bucket" "ml_data" {
  bucket = "${var.project_name}-${var.environment}-ml-data"
  tags = {
    Environment = var.environment
    Project     = var.project_name
    Purpose     = "ML Data Lake"
  }
}
# Keep previous object versions so datasets can be recovered
resource "aws_s3_bucket_versioning" "ml_data" {
  bucket = aws_s3_bucket.ml_data.id
  versioning_configuration {
    status = "Enabled"
  }
}
# Move aging data to cheaper storage classes
resource "aws_s3_bucket_lifecycle_configuration" "ml_data" {
  bucket = aws_s3_bucket.ml_data.id
  rule {
    id     = "archive-old-data"
    status = "Enabled"
    # An empty filter applies the rule to every object in the bucket
    filter {}
    transition {
      days          = 90
      storage_class = "STANDARD_IA"
    }
    transition {
      days          = 180
      storage_class = "GLACIER"
    }
  }
}
# Folder structure (zero-byte objects that act as prefixes)
resource "aws_s3_object" "raw_data" {
  bucket       = aws_s3_bucket.ml_data.id
  key          = "raw/"
  content_type = "application/x-directory"
}
resource "aws_s3_object" "processed_data" {
  bucket       = aws_s3_bucket.ml_data.id
  key          = "processed/"
  content_type = "application/x-directory"
}
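Because this configuration lives in modules/storage, it needs its own input declarations, and the root module has to call it. The following is a sketch under assumptions rather than the tutorial's own layout: the file split, the output name bucket_name, and the module label storage are hypothetical, and the bucket is assumed to be addressable as aws_s3_bucket.ml_data.

```hcl
# modules/storage/variables.tf (assumed file: inputs the module reads)
variable "project_name" {
  description = "Project name used as the bucket name prefix"
  type        = string
}

variable "environment" {
  description = "Environment name, for example dev or prod"
  type        = string
}

# modules/storage/outputs.tf (assumed file, hypothetical output name)
output "bucket_name" {
  description = "Name of the ML data bucket"
  value       = aws_s3_bucket.ml_data.bucket
}

# main.tf in the root module: pass the root variables into the module
module "storage" {
  source       = "./modules/storage"
  project_name = var.project_name
  environment  = var.environment
}

# outputs.tf in the root module: surface the bucket name after apply
output "ml_data_bucket" {
  value = module.storage.bucket_name
}
```

After adding a module block, run terraform init again so Terraform installs the module before the next plan.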