MLflow vs Neptune.ai: Complete Guide to Experiment Tracking for MLOps

By Ruby Abdullah · tutorial
Tags: MLOps, MLflow, Neptune.ai, Experiment Tracking, Machine Learning, Python

Experiment tracking is a crucial component in MLOps that enables data science teams to track, compare, and reproduce machine learning experiments. In this tutorial, we'll compare two popular platforms: MLflow (open-source) and Neptune.ai (managed service), and learn how to use both.

Why is Experiment Tracking Important?

Without proper experiment tracking, ML teams often face:

  • Reproducibility crisis: Unable to reproduce previous experiment results
  • Lost experiments: Losing configurations that produced the best model
  • Collaboration issues: Difficult to share results across teams
  • Technical debt: Spreadsheets and manual notes that don't scale

Overview: MLflow vs Neptune.ai

| Aspect | MLflow | Neptune.ai |
|--------|--------|------------|
| Type | Open-source | Managed SaaS |
| Hosting | Self-hosted / Managed | Cloud-hosted |
| Pricing | Free (infra cost) | Free tier + paid plans |
| Setup | Manual setup | Instant |
| UI | Basic | Advanced |
| Collaboration | Limited | Built-in |
| Integrations | 15+ frameworks | 25+ frameworks |
| Model Registry | Yes | Yes |
| Best For | Full control, on-prem | Quick start, teams |

Part 1: MLflow

1.1 Installing MLflow

    # Install MLflow
    pip install mlflow

    # Optional extras, for a tracking server with a database backend
    # (quoted so that shells such as zsh do not expand the brackets)
    pip install "mlflow[extras]"

    # Start the tracking UI locally
    mlflow ui --port 5000

    # Or start a tracking server with an explicit backend store
    mlflow server \
        --backend-store-uri sqlite:///mlflow.db \
        --default-artifact-root ./mlruns \
        --host 0.0.0.0 \
        --port 5000
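
Before pointing any training script at the server, it can help to confirm that it is actually reachable. The snippet below is a minimal sketch using only the Python standard library. It assumes the server answers `GET /health` with HTTP 200, which MLflow tracking servers generally do, but verify this for your version. The function name `tracking_server_is_up` is hypothetical, and the URL simply mirrors the `--port 5000` setting used above.

```python
import urllib.request


def tracking_server_is_up(base_url, timeout=2.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError are both OSError subclasses).
        return False


# Assumed address: the local server started with --port 5000.
print(tracking_server_is_up("http://localhost:5000"))
```

If this prints `False`, check that the server process is still running and that the host and port match the values passed to `mlflow server`.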

1.2 Basic Experiment Tracking

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.model_selection import train_test_split

    # Set tracking URI (optional, default: ./mlruns)
    mlflow.set_tracking_uri("http://localhost:5000")

    # Set experiment name
    mlflow.set_experiment("iris-classification")

    # Load data
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # Start run
    with mlflow.start_run(run_name="random-forest-v1"):
        # Log parameters
        params = {
            "n_estimators": 100,
            "max_depth": 5,
            "random_state": 42,
        }
        mlflow.log_params(params)

        # Train model (the dict is unpacked into keyword arguments)
        model = RandomForestClassifier(**params)
        model.fit(X_train, y_train)

        # Predict and evaluate
        y_pred = model.predict(X_test)
        accuracy = accuracy_score(y_test, y_pred)
        f1 = f1_score(y_test, y_pred, average="weighted")

        # Log metrics
        mlflow.log_metrics({
            "accuracy": accuracy,
            "f1_score": f1,
        })

        # Log model
        mlflow.sklearn.log_model(model, "model")

        # Log artifacts (additional files)
        with open("feature_importance.txt", "w") as f:
            for name, importance in zip(load_iris().feature_names, model.feature_importances_):
                f.write(f"{name}: {importance:.4f}\n")
        mlflow.log_artifact("feature_importance.txt")

        print(f"Run ID: {mlflow.active_run().info.run_id}")
        print(f"Accuracy: {accuracy:.4f}")

1.3 Hyperparameter Tuning with MLflow

    import itertools

    import mlflow
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    mlflow.set_experiment("iris-hyperparameter-tuning")

    X, y = load_iris(return_X_y=True)

    # Hyperparameter grid
    param_grid = {
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 5, 10, None],
        "min_samples_split": [2, 5, 10],
    }
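
One way to walk a grid like this is `itertools.product`, which the listing already imports. The sketch below is an assumption about how the grid might be consumed, not the article's original code. It expands the grid into one parameter dict per combination using only the standard library, and `expand_grid` is a hypothetical helper name.

```python
import itertools

# Same grid as in the listing above.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 5, 10],
}


def expand_grid(grid):
    """Yield one {name: value} dict per combination of grid values."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combinations = list(expand_grid(param_grid))
print(len(combinations))  # 3 * 4 * 3 = 36
print(combinations[0])    # {'n_estimators': 50, 'max_depth': 3, 'min_samples_split': 2}
```

Each dict could then be passed to `mlflow.log_params` and to `RandomForestClassifier(**params)` inside its own `mlflow.start_run` block, so that every combination shows up as a separate run in the tracking UI.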
