Complete MLflow Tutorial: From Setup to Production

Introduction

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. Developed by Databricks, MLflow helps data scientists and ML engineers track experiments, package code, manage models, and deploy to production.

Why MLflow?

Reproducibility: Track all experiments in detail
Collaboration: Share results with your team
Model versioning: Manage different model versions
Deployment ready: Easily deploy models to various platforms
Framework agnostic: Works with TensorFlow, PyTorch, Scikit-learn, etc.

MLflow Main Components

MLflow consists of four main components:

MLflow Tracking: Record and query experiments (parameters, metrics, and artifacts)

MLflow Projects: Package ML code for reproducibility

MLflow Models: Package models in a standard format for deployment to various platforms

MLflow Model Registry: Centralized model store for versioning

Installation and Setup

Basic Installation

# Install MLflow
pip install mlflow

# Install with extras for various backends
pip install "mlflow[extras]"

# Verify installation
mlflow --version

Setup Database Backend (PostgreSQL)

For production, use a database backend:

# Install dependencies
pip install psycopg2-binary

# Setup PostgreSQL (example using Docker)
docker run -d \
  --name mlflow-db \
  -e POSTGRES_USER=mlflow \
  -e POSTGRES_PASSWORD=mlflow \
  -e POSTGRES_DB=mlflow \
  -p 5432:5432 \
  postgres:13

Setup Artifact Store (MinIO/S3)

# Install boto3 for S3 compatibility
pip install boto3

# Setup MinIO (S3-compatible storage)
docker run -d \
  --name mlflow-minio \
  -p 9000:9000 \
  -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin \
  -e MINIO_ROOT_PASSWORD=minioadmin \
  minio/minio server /data --console-address ":9001"
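A freshly started MinIO container has no buckets, and the artifact root used later in this tutorial (s3://mlflow-artifacts) assumes one already exists. The following is a minimal sketch of creating it from Python. The helper name ensure_bucket is hypothetical, and the commented usage assumes the endpoint and credentials from the Docker command above.

```python
def ensure_bucket(s3_client, bucket_name):
    """Create bucket_name if it does not exist; return True if it was created."""
    # list_buckets() returns a dict whose "Buckets" entry holds {"Name": ...} items.
    existing = {b["Name"] for b in s3_client.list_buckets().get("Buckets", [])}
    if bucket_name in existing:
        return False
    s3_client.create_bucket(Bucket=bucket_name)
    return True


# Example usage against the MinIO container (requires boto3 and a running server):
# import boto3
# s3 = boto3.client(
#     "s3",
#     endpoint_url="http://localhost:9000",
#     aws_access_key_id="minioadmin",
#     aws_secret_access_key="minioadmin",
# )
# ensure_bucket(s3, "mlflow-artifacts")
```

The function takes the client as an argument rather than building it, so the same code can point at MinIO locally and at S3 elsewhere. The bucket can also be created by hand in the MinIO console on port 9001.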

Run MLflow Server

# Development mode (local file store)
mlflow server --host 0.0.0.0 --port 5000

# Production mode (with database and S3)
mlflow server \
  --backend-store-uri postgresql://mlflow:mlflow@localhost:5432/mlflow \
  --default-artifact-root s3://mlflow-artifacts \
  --host 0.0.0.0 \
  --port 5000
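The backend store URI follows the pattern postgresql://user:password@host:port/database. The demo password here is plain text, but a real password containing characters such as "@" or "/" would break the URI unless it is percent-encoded. Here is a small sketch of building the URI safely. The function name postgres_backend_uri is hypothetical, and the defaults mirror the values used in this tutorial.

```python
from urllib.parse import quote


def postgres_backend_uri(user, password, host="localhost", port=5432, database="mlflow"):
    """Build a postgresql:// URI, percent-encoding the credentials."""
    # safe="" also encodes "/" and "@", which would otherwise be read as URI delimiters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


print(postgres_backend_uri("mlflow", "mlflow"))
# prints postgresql://mlflow:mlflow@localhost:5432/mlflow
```

The printed value is the string passed to --backend-store-uri in the production command.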

Setup Environment Variables

Create a .env file:

# MLflow Tracking
MLFLOW_TRACKING_URI=http://localhost:5000

# S3/MinIO Configuration
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
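These values only take effect once they are present in the process environment. Assuming they are not already exported in the shell, one option is to load the file at the top of a training script. The sketch below uses only the standard library; load_env_file is a hypothetical helper, and it handles only simple KEY=VALUE lines (a package such as python-dotenv covers more syntax).

```python
import os


def load_env_file(path=".env", override=False):
    """Read KEY=VALUE lines from path into os.environ; return the parsed pairs."""
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and anything without "=".
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip('"').strip("'")
            loaded[key] = value
            # Keep variables already set in the shell unless override is requested.
            if override or key not in os.environ:
                os.environ[key] = value
    return loaded
```

Calling load_env_file() before any MLflow call lets the client pick up MLFLOW_TRACKING_URI, and lets boto3 pick up the AWS_* credentials, from the environment.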

MLflow Tracking: Experiment Tracking

Basic Tracking

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Set tracking URI
mlflow.set_tracking_uri("http://localhost:5000")

# Set experiment
mlflow.set_experiment("iris-classification")

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Start MLflow run
with mlflow.start_run(run_name="random-forest-v1") as run:
    # Log parameters
    params = {
        "n_estimators": 100,
        "max_depth": 5,
        "random_state": 42,
    }
    mlflow.log_params(params)

    # Train model (unpack the dict into keyword arguments)
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)

    # Make predictions
    y_pred = model.predict(X_test)

    # Log metrics
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1_score": f1_score(y_test, y_pred, average="weighted"),
    }
    mlflow.log_metrics(metrics)

    # Log model and register it as "iris-classifier"
    mlflow.sklearn.log_model(
        model,
        "model",
        registered_model_name="iris-classifier",
    )
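Once the run has finished, the logged model can be addressed by URI, either through the registry (models:/name/version) or through the run that produced it (runs:/run_id/artifact_path). The sketch below shows both forms. The helper names are hypothetical, and version 1 is an assumption that holds only if this was the first model registered under "iris-classifier".

```python
def registered_model_uri(name, version):
    """URI for a specific version of a registered model."""
    return f"models:/{name}/{version}"


def run_model_uri(run_id, artifact_path="model"):
    """URI for a model logged under artifact_path in a given run."""
    return f"runs:/{run_id}/{artifact_path}"


def load_iris_classifier(version=1):
    """Load the registered scikit-learn model (needs mlflow and a reachable server)."""
    import mlflow.sklearn  # imported lazily so the URI helpers work without mlflow

    return mlflow.sklearn.load_model(registered_model_uri("iris-classifier", version))


print(registered_model_uri("iris-classifier", 1))
# prints models:/iris-classifier/1
```

With the tracking URI set as in the script above, load_iris_classifier() should return a fitted RandomForestClassifier whose predict method can be called on new samples.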

Related Articles

MLflow vs Neptune.ai: Complete Guide to Experiment Tracking for MLOps

Complete Vertex AI Tutorial: Google Cloud Unified ML Platform

Azure MLflow Integration Tutorial: Experiment Tracking on Azure

Complete Azure Machine Learning Tutorial: End-to-End ML Platform
