MLflow vs Neptune.ai: A Complete Guide to Experiment Tracking for MLOps

Experiment tracking is a core MLOps component that lets data science teams track, compare, and reproduce machine learning experiments. This tutorial compares two popular platforms, MLflow (open source) and Neptune.ai (a managed service), and shows how to use both.

Why Does Experiment Tracking Matter?

Without proper experiment tracking, ML teams often run into:

Reproducibility crisis: Results from earlier experiments cannot be reproduced
Lost experiments: The configuration that produced the best model is lost
Collaboration issues: Results are hard to share across teams
Technical debt: Spreadsheets and manual notes that do not scale

Overview: MLflow vs Neptune.ai

| Aspect | MLflow | Neptune.ai |
|--------|--------|------------|
| Type | Open-source | Managed SaaS |
| Hosting | Self-hosted / Managed | Cloud-hosted |
| Pricing | Free (infra cost) | Free tier + paid plans |
| Setup | Manual setup | Instant |
| UI | Basic | Advanced |
| Collaboration | Limited | Built-in |
| Integrations | 15+ frameworks | 25+ frameworks |
| Model Registry | Yes | Yes |
| Best For | Full control, on-prem | Quick start, teams |

Part 1: MLflow

1.1 Installing MLflow

# Install MLflow
pip install mlflow

# For a tracking server with a database backend
pip install "mlflow[extras]"

# Start the tracking UI (local)
mlflow ui --port 5000

# Or start a server with a backend store
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root ./mlruns \
  --host 0.0.0.0 \
  --port 5000
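Once the server is started, it can be useful to confirm that it is reachable before pointing training scripts at it. The following is a minimal sketch using only the Python standard library. It assumes the server listens on the host and port used above and that it answers on a `/health` route; the helper name `is_tracking_server_up` is hypothetical and not part of MLflow.

```python
import urllib.error
import urllib.request


def is_tracking_server_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200.

    Assumption: the tracking server exposes a /health route.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `is_tracking_server_up("http://localhost:5000")` returns `False` when nothing is listening on that port, which makes it a cheap guard at the top of a training script.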

1.2 Basic Experiment Tracking

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

# Set tracking URI (optional, default: ./mlruns)
mlflow.set_tracking_uri("http://localhost:5000")

# Set experiment name
mlflow.set_experiment("iris-classification")

# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Start run
with mlflow.start_run(run_name="random-forest-v1"):
    # Log parameters
    params = {
        "n_estimators": 100,
        "max_depth": 5,
        "random_state": 42
    }
    mlflow.log_params(params)

    # Train model (unpack the dict as keyword arguments)
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)

    # Predict and evaluate
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred, average='weighted')

    # Log metrics
    mlflow.log_metrics({
        "accuracy": accuracy,
        "f1_score": f1
    })

    # Log model
    mlflow.sklearn.log_model(model, "model")

    # Log artifacts (additional files)
    with open("feature_importance.txt", "w") as f:
        for name, importance in zip(load_iris().feature_names, model.feature_importances_):
            f.write(f"{name}: {importance:.4f}\n")
    mlflow.log_artifact("feature_importance.txt")

    print(f"Run ID: {mlflow.active_run().info.run_id}")
    print(f"Accuracy: {accuracy:.4f}")

1.3 Hyperparameter Tuning with MLflow

import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_iris
import itertools

mlflow.set_experiment("iris-hyperparameter-tuning")

X, y = load_iris(return_X_y=True)

# Hyperparameter grid
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, None],
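The snippet above is cut off in the source, so the rest of the grid and the tuning loop are not shown. As a rough sketch of how such a grid is commonly expanded with `itertools` (which the snippet imports), the example below enumerates every combination of the two keys that are visible. The helper `expand_grid` is hypothetical, and the grid is limited to those two keys as an assumption; any further entries in the original are unknown.

```python
import itertools

# Assumption: only these two keys are visible in the truncated snippet.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, None],
}


def expand_grid(grid):
    """Yield one dict per combination of the grid's values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


combinations = list(expand_grid(param_grid))
print(len(combinations))  # 3 * 4 = 12 combinations
print(combinations[0])    # {'n_estimators': 50, 'max_depth': 3}
```

In a tuning loop, each dictionary would typically be logged with `mlflow.log_params` inside its own `mlflow.start_run(...)` block, so that every combination shows up as a separate run in the experiment.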

