# MLflow vs Neptune.ai: A Complete Guide to Experiment Tracking for MLOps

By Ruby Abdullah · tutorial
Tags: MLOps, MLflow, Neptune.ai, Experiment Tracking, Machine Learning, Python

Experiment tracking is a crucial component of MLOps: it lets data science teams track, compare, and reproduce machine learning experiments. In this tutorial we compare two popular platforms, MLflow (open-source) and Neptune.ai (a managed service), and learn how to use both.

## Why Does Experiment Tracking Matter?

Without proper experiment tracking, ML teams often run into:

  • Reproducibility crisis: previous experiment results cannot be reproduced
  • Lost experiments: the configuration that produced the best model is lost
  • Collaboration issues: results are hard to share across teams
  • Technical debt: spreadsheets and manual notes that do not scale

## Overview: MLflow vs Neptune.ai

| Aspect | MLflow | Neptune.ai |
|--------|--------|------------|
| Type | Open-source | Managed SaaS |
| Hosting | Self-hosted / Managed | Cloud-hosted |
| Pricing | Free (infra cost) | Free tier + paid plans |
| Setup | Manual setup | Instant |
| UI | Basic | Advanced |
| Collaboration | Limited | Built-in |
| Integrations | 15+ frameworks | 25+ frameworks |
| Model Registry | Yes | Yes |
| Best For | Full control, on-prem | Quick start, teams |

## Part 1: MLflow

### 1.1 Installing MLflow

    # Install MLflow
    pip install mlflow

    # For a tracking server with a database backend
    # (quoted so that shells such as zsh do not expand the brackets)
    pip install "mlflow[extras]"

    # Start the tracking UI locally
    mlflow ui --port 5000

    # Or run a server with an explicit backend store
    # (--host 0.0.0.0 listens on all network interfaces)
    mlflow server \
      --backend-store-uri sqlite:///mlflow.db \
      --default-artifact-root ./mlruns \
      --host 0.0.0.0 \
      --port 5000
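
If the server has to be started from a Python script (for example in a test fixture), the same command can be assembled programmatically. The sketch below only builds and prints the command from the flags shown above. The helper name `build_mlflow_server_command` and its default values are illustrative assumptions, not part of MLflow.

```python
import shlex


def build_mlflow_server_command(
    backend_store_uri="sqlite:///mlflow.db",
    artifact_root="./mlruns",
    host="0.0.0.0",
    port=5000,
):
    """Return the `mlflow server` invocation as an argument list.

    Hypothetical helper: it only assembles the flags used in this tutorial.
    """
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_store_uri,
        "--default-artifact-root", artifact_root,
        "--host", host,
        "--port", str(port),
    ]


command = build_mlflow_server_command(port=5001)
# shlex.join quotes arguments where needed so the string can be pasted into a shell.
print(shlex.join(command))
```

Passing the list to `subprocess.Popen(command)` would start the server, assuming the `mlflow` executable is on the PATH.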

### 1.2 Basic Experiment Tracking

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.model_selection import train_test_split

    # Set the tracking URI (optional, default: ./mlruns)
    mlflow.set_tracking_uri("http://localhost:5000")

    # Set the experiment name
    mlflow.set_experiment("iris-classification")

    # Load data
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # Start a run
    with mlflow.start_run(run_name="random-forest-v1"):
        # Log parameters
        params = {
            "n_estimators": 100,
            "max_depth": 5,
            "random_state": 42,
        }
        mlflow.log_params(params)

        # Train the model (** unpacks the dict into keyword arguments,
        # so the logged parameters are exactly the ones used for training)
        model = RandomForestClassifier(**params)
        model.fit(X_train, y_train)

        # Predict and evaluate
        y_pred = model.predict(X_test)
        accuracy = accuracy_score(y_test, y_pred)
        f1 = f1_score(y_test, y_pred, average="weighted")

        # Log metrics
        mlflow.log_metrics({
            "accuracy": accuracy,
            "f1_score": f1,
        })

        # Log the model
        mlflow.sklearn.log_model(model, "model")

        # Log artifacts (additional files)
        with open("feature_importance.txt", "w") as f:
            for name, importance in zip(load_iris().feature_names, model.feature_importances_):
                f.write(f"{name}: {importance:.4f}\n")
        mlflow.log_artifact("feature_importance.txt")

        # active_run() is only set while the run is open, so print inside the block
        print(f"Run ID: {mlflow.active_run().info.run_id}")
        print(f"Accuracy: {accuracy:.4f}")
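
The artifact step above writes the feature-importance file into the current working directory. A minimal sketch of one alternative, assuming you would rather keep such files in a temporary directory: the helper `write_importance_report` is hypothetical, and the importance values are placeholders, not results from the run above.

```python
import tempfile
from pathlib import Path


def write_importance_report(names, importances, directory):
    """Write one 'name: value' line per feature and return the file path."""
    path = Path(directory) / "feature_importance.txt"
    lines = [f"{name}: {value:.4f}" for name, value in zip(names, importances)]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Placeholder values for illustration only (not measured importances).
feature_names = ["sepal length (cm)", "sepal width (cm)"]
placeholder_importances = [0.25, 0.75]

with tempfile.TemporaryDirectory() as tmp_dir:
    report_path = write_importance_report(feature_names, placeholder_importances, tmp_dir)
    report_text = report_path.read_text(encoding="utf-8")
    # Inside an active run, mlflow.log_artifact(str(report_path)) would go here,
    # before the temporary directory is deleted.

print(report_text)
```

Because the file is copied into the run's artifact store at logging time, the local copy can be discarded once the `with` block ends.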

### 1.3 Hyperparameter Tuning with MLflow

    import itertools

    import mlflow
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    mlflow.set_experiment("iris-hyperparameter-tuning")

    X, y = load_iris(return_X_y=True)

    # Hyperparameter grid
    param_grid = {
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 5, 10, None],

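The listing above breaks off inside the grid definition. As a hedged sketch of one common way such a grid is consumed (the `itertools` import hints at it, but the source does not show it), the snippet below enumerates every combination of the two keys that are visible. The helper `expand_grid` and the run-name format are assumptions.

```python
import itertools


def expand_grid(param_grid):
    """Yield one dict per combination of the values in param_grid."""
    keys = list(param_grid)
    for values in itertools.product(*(param_grid[key] for key in keys)):
        yield dict(zip(keys, values))


# Only the two entries visible in the listing; the original grid may contain more.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, None],
}

combinations = list(expand_grid(param_grid))
for params in combinations:
    # Hypothetical run name. Each combination could be logged inside
    # `with mlflow.start_run(run_name=run_name): mlflow.log_params(params)`.
    run_name = f"rf-n{params['n_estimators']}-d{params['max_depth']}"
    print(run_name, params)
```

With three values for `n_estimators` and four for `max_depth`, this produces twelve combinations, each of which would become its own run in the tracking UI.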