Complete Kubeflow Tutorial: MLOps on Kubernetes

By Ruby Abdullah · tutorial
Tags: Kubeflow, Kubernetes, MLOps, ML Pipeline, Distributed Training, Model Serving

Kubeflow is an open-source platform for deploying, managing, and scaling machine learning workflows on Kubernetes. It provides a complete MLOps solution with pipelines, model serving, notebooks, and experiment tracking.

Why Kubeflow?

Kubeflow's strengths:
  • Kubernetes native: builds on the scalability and reliability of K8s
  • End-to-end MLOps: from experimentation to production
  • Portable: runs on any Kubernetes cluster
  • Composable: use only the components you need
  • Open source: an active community and a broad ecosystem

Use cases:
  • ML pipeline orchestration
  • Distributed training
  • Large-scale model serving
  • Experiment tracking
  • Feature engineering

Installation

1. Prerequisites

These steps assume you already have a running Kubernetes cluster and a kubeconfig that points to it.

# Install kubectl (Linux, amd64)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl && sudo mv kubectl /usr/local/bin/

# Install kustomize
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
sudo mv kustomize /usr/local/bin/
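Before moving on, it can help to confirm that both tools are actually on your PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and not part of any Kubeflow tooling.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # kubectl and kustomize are the two prerequisites installed above
    missing = missing_tools(["kubectl", "kustomize"])
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("kubectl and kustomize are available")
```

If a tool is reported as missing, check that /usr/local/bin is on your PATH and that the `mv` step succeeded.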

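2. Install Kubeflow

# Clone the manifests repository
git clone https://github.com/kubeflow/manifests.git
cd manifests

# Install with kustomize. The loop retries until every resource applies,
# since some resources depend on CRDs and webhooks that take time to become ready.
while ! kustomize build example | kubectl apply -f -; do
  echo "Retrying..."
  sleep 10
done

# Check the installation
kubectl get pods -n kubeflow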
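Reading the pod table by eye gets tedious while many pods are still starting. As a rough sketch, the function below takes the JSON form of the pod list (the shape produced by `kubectl get pods -n kubeflow -o json`) and returns the pods that are not ready yet. The function name and the sample pod names are hypothetical and only illustrate the structure.

```python
def not_ready_pods(pod_list):
    """Return names of pods that are neither fully ready nor completed."""
    pending = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Succeeded":
            continue  # completed one-off pods are fine
        containers = status.get("containerStatuses", [])
        all_ready = bool(containers) and all(c.get("ready", False) for c in containers)
        if status.get("phase") != "Running" or not all_ready:
            pending.append(pod["metadata"]["name"])
    return pending


# Hypothetical sample shaped like `kubectl get pods -o json` output
sample = {
    "items": [
        {
            "metadata": {"name": "ml-pipeline-abc"},
            "status": {"phase": "Running", "containerStatuses": [{"ready": True}]},
        },
        {
            "metadata": {"name": "katib-controller-xyz"},
            "status": {"phase": "Pending"},
        },
    ]
}

print(not_ready_pods(sample))
```

Against a live cluster, you could pipe the kubectl output into a script that calls `json.load(sys.stdin)` and passes the result to this function.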

3. Access the Dashboard

# Port-forward the Istio ingress gateway
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80

Open http://localhost:8080 in a browser.

Default credentials: user@example.com / 12341234
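If the browser cannot connect, a quick first check is whether anything is listening on the forwarded port at all. This is a minimal sketch with the standard library; `is_port_open` is a hypothetical helper name, and it only tests the TCP connection, not whether the dashboard itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8080 is the local port chosen in the port-forward command above
    if is_port_open("localhost", 8080):
        print("Port 8080 is reachable; try http://localhost:8080")
    else:
        print("Nothing is listening on localhost:8080; is the port-forward running?")
```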

Kubeflow Pipelines

1. Basic Pipeline

from kfp import compiler, dsl
from kfp.dsl import Dataset, Input, Model, Output


# Each component runs in its own container, so third-party packages are
# declared per component, and files are passed as artifacts rather than as
# local /tmp paths, which are not shared between steps.
@dsl.component(packages_to_install=["pandas"])
def preprocess_data(data_path: str, clean_data: Output[Dataset]):
    import pandas as pd

    df = pd.read_csv(data_path)
    df = df.dropna()
    df.to_csv(clean_data.path, index=False)


@dsl.component(packages_to_install=["pandas", "scikit-learn"])
def train_model(data: Input[Dataset], epochs: int, model_out: Output[Model]):
    import pickle

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    df = pd.read_csv(data.path)
    X = df.drop("target", axis=1)
    y = df["target"]

    # epochs is accepted but unused: a random forest is not trained in epochs
    model = RandomForestClassifier(n_estimators=100)
    model.fit(X, y)

    with open(model_out.path, "wb") as f:
        pickle.dump(model, f)


@dsl.component(packages_to_install=["pandas", "scikit-learn"])
def evaluate_model(model_in: Input[Model], test_data: Input[Dataset]) -> float:
    import pickle

    import pandas as pd
    from sklearn.metrics import accuracy_score

    with open(model_in.path, "rb") as f:
        model = pickle.load(f)

    df = pd.read_csv(test_data.path)
    X = df.drop("target", axis=1)
    y = df["target"]

    predictions = model.predict(X)
    return float(accuracy_score(y, predictions))


# A lowercase, hyphenated pipeline name is accepted across KFP versions
@dsl.pipeline(name="ml-training-pipeline")
def ml_pipeline(data_path: str, epochs: int = 10):
    preprocess_task = preprocess_data(data_path=data_path)
    train_task = train_model(
        data=preprocess_task.outputs["clean_data"],
        epochs=epochs,
    )
    # Note: this evaluates on the same data used for training,
    # so the reported accuracy is optimistic.
    evaluate_task = evaluate_model(
        model_in=train_task.outputs["model_out"],
        test_data=preprocess_task.outputs["clean_data"],
    )


# Compile the pipeline
compiler.Compiler().compile(ml_pipeline, "pipeline.yaml")
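The pipeline above never states an execution order explicitly; the order follows from which step consumes which output. The sketch below illustrates that idea with the standard library's `graphlib` (Python 3.9+). It is only a model of the dependency graph, not how Kubeflow Pipelines schedules work internally, and the short step labels are chosen here for illustration.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose outputs it consumes
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}


def execution_order(deps):
    """Return one valid order in which the steps can run."""
    return list(TopologicalSorter(deps).static_order())


print(execution_order(dependencies))
```

Because the evaluate step needs the trained model, it always runs last, even though it also reads the preprocessing output directly.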

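2. Run the Pipeline

from kfp.client import Client

# Connect to Kubeflow through the port-forward from the dashboard step
client = Client(host="http://localhost:8080/pipeline")

# Create an experiment
experiment = client.create_experiment("my-experiment")

# Run the pipeline. Reading a gs:// path with pandas also requires the
# gcsfs package in the component's environment.
run = client.run_pipeline(
    experiment_id=experiment.experiment_id,  # `experiment.id` in KFP SDK v1
    job_name="training-run-1",
    pipeline_package_path="pipeline.yaml",
    params={"data_path": "gs://bucket/data.csv", "epochs": 20},
)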
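The keys in `params` have to match the pipeline function's argument names, and a misspelled key is easy to miss until the run is submitted. As a rough sketch, you could check the dictionary locally first. The helper `check_params` is hypothetical, and `ml_pipeline` here is a plain stand-in with the same signature as the pipeline function, since a decorated pipeline object may not expose its signature to `inspect` in the same way.

```python
import inspect


def ml_pipeline(data_path: str, epochs: int = 10):
    """Stand-in with the same signature as the pipeline function."""


def check_params(func, params):
    """Return a list of problems found when matching params to func's signature."""
    problems = []
    signature = inspect.signature(func)
    for name in params:
        if name not in signature.parameters:
            problems.append(f"unknown parameter: {name}")
    for name, param in signature.parameters.items():
        if name in params:
            expected = param.annotation
            if expected is not inspect.Parameter.empty and not isinstance(params[name], expected):
                problems.append(
                    f"{name} should be {expected.__name__}, got {type(params[name]).__name__}"
                )
        elif param.default is inspect.Parameter.empty:
            problems.append(f"missing required parameter: {name}")
    return problems


print(check_params(ml_pipeline, {"data_path": "gs://bucket/data.csv", "epochs": 20}))
```

An empty list means the names and basic types line up; anything else is worth fixing before calling `run_pipeline`.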
