Complete MLflow Tutorial: From Setup to Production

Introduction

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. Developed by Databricks, it helps data scientists and ML engineers track experiments, package code, manage models, and deploy them to production.

Why MLflow?

Reproducibility: Track every experiment in detail
Collaboration: Share results with your team
Model versioning: Manage multiple versions of a model
Deployment ready: Deploy models easily to a range of platforms
Framework agnostic: Works with TensorFlow, PyTorch, scikit-learn, and more

Main Components of MLflow

MLflow consists of four main components:

MLflow Tracking: Record and query experiments

MLflow Projects: Package ML code for reproducibility

MLflow Models: Package models in a standard format and deploy them to a range of platforms

MLflow Model Registry: Centralized model store for versioning

Installation and Setup

Basic Installation

# Install MLflow
pip install mlflow

# Install with extras for additional backends
pip install "mlflow[extras]"

# Verify the installation
mlflow --version
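The installation can also be checked from inside Python, which is convenient in notebooks or CI scripts. A minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of MLflow:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("mlflow")
    print(f"mlflow: {version or 'not installed'}")
```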

Setup Database Backend (PostgreSQL)

For production, use a database as the backend store:

# Install dependencies
pip install psycopg2-binary

# Set up PostgreSQL (example using Docker)
docker run -d \
  --name mlflow-db \
  -e POSTGRES_USER=mlflow \
  -e POSTGRES_PASSWORD=mlflow \
  -e POSTGRES_DB=mlflow \
  -p 5432:5432 \
  postgres:13
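The credentials, host, port, and database name chosen here are what the MLflow server later needs in its backend store URI. A minimal sketch of assembling that URI in Python; the function name `backend_store_uri` is hypothetical, and the defaults simply mirror the Docker example above. User and password are percent-encoded so that special characters do not break the URI:

```python
from urllib.parse import quote


def backend_store_uri(user, password, host="localhost", port=5432, database="mlflow"):
    """Build a PostgreSQL URI suitable for `mlflow server --backend-store-uri`.

    User and password are percent-encoded so characters such as '@' or '/'
    cannot be mistaken for URI delimiters.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    print(backend_store_uri("mlflow", "mlflow"))
```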

Setup Artifact Store (MinIO/S3)

# Install boto3 for S3 compatibility
pip install boto3

# Set up MinIO (S3-compatible storage)
docker run -d \
  --name mlflow-minio \
  -p 9000:9000 \
  -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin \
  -e MINIO_ROOT_PASSWORD=minioadmin \
  minio/minio server /data --console-address ":9001"
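A fresh MinIO instance starts without any buckets, so the bucket used as the artifact root (`mlflow-artifacts` in this tutorial) has to be created before artifacts can be logged. A minimal sketch with boto3, assuming the MinIO container above is reachable at http://localhost:9000 with the `minioadmin` credentials; the helper names `make_minio_client` and `ensure_bucket` are hypothetical:

```python
def make_minio_client(endpoint_url="http://localhost:9000",
                      access_key="minioadmin",
                      secret_key="minioadmin"):
    """Create a boto3 S3 client pointed at the local MinIO endpoint."""
    import boto3  # imported lazily so ensure_bucket can be used without it

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def ensure_bucket(client, bucket_name):
    """Create `bucket_name` if it does not exist; return True if it was created."""
    existing = {b["Name"] for b in client.list_buckets().get("Buckets", [])}
    if bucket_name in existing:
        return False
    client.create_bucket(Bucket=bucket_name)
    return True


# Usage (requires boto3 and a running MinIO container):
#   ensure_bucket(make_minio_client(), "mlflow-artifacts")
```

The same bucket can also be created by hand in the MinIO console on port 9001.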

Run the MLflow Server

# Development mode (local file store)
mlflow server --host 0.0.0.0 --port 5000

# Production mode (with database and S3)
mlflow server \
  --backend-store-uri postgresql://mlflow:mlflow@localhost:5432/mlflow \
  --default-artifact-root s3://mlflow-artifacts \
  --host 0.0.0.0 \
  --port 5000
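Once the server is running, a quick way to confirm it is reachable is to query its `/health` endpoint, which the tracking server answers with HTTP 200 when it is up. A minimal sketch using only the standard library; the function name `server_is_healthy` is hypothetical, and the default URL assumes the host and port used above:

```python
import urllib.request


def server_is_healthy(base_url="http://localhost:5000", timeout=3.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses
        # (urllib.error.URLError and HTTPError are OSError subclasses).
        return False


if __name__ == "__main__":
    print("MLflow server up:", server_is_healthy())
```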

Setup Environment Variables

Create a .env file:

# MLflow Tracking
MLFLOW_TRACKING_URI=http://localhost:5000

# S3/MinIO Configuration
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
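MLflow and boto3 read these settings from the process environment, so the file has to be loaded into the environment before the first MLflow call. A minimal sketch of a loader using only the standard library, assuming plain `KEY=value` lines; the function name `load_env_file` is hypothetical, and a dedicated package such as python-dotenv handles quoting and edge cases more thoroughly:

```python
import os


def load_env_file(path=".env", override=False):
    """Load KEY=value pairs from `path` into os.environ; return what was parsed."""
    loaded = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines, comments, and anything without '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            key, value = key.strip(), value.strip().strip("'\"")
            loaded[key] = value
            # Existing environment values win unless override is requested.
            if override or key not in os.environ:
                os.environ[key] = value
    return loaded
```

With `override=False`, values already exported in the shell take precedence over the file, which keeps per-machine overrides possible.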

MLflow Tracking: Experiment Tracking

Basic Tracking

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Set tracking URI
mlflow.set_tracking_uri("http://localhost:5000")

# Set experiment
mlflow.set_experiment("iris-classification")

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Start MLflow run
with mlflow.start_run(run_name="random-forest-v1") as run:
    # Log parameters
    params = {
        "n_estimators": 100,
        "max_depth": 5,
        "random_state": 42,
    }
    mlflow.log_params(params)

    # Train model (unpack the dict into keyword arguments)
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)

    # Make predictions
    y_pred = model.predict(X_test)

    # Log metrics
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1_score": f1_score(y_test, y_pred, average="weighted"),
    }
    mlflow.log_metrics(metrics)

    # Log model
    mlflow.sklearn.log_model(
        model,
        "model",
    )
