Complete Kubeflow Tutorial: MLOps on Kubernetes
Kubeflow is an open-source platform for deploying, managing, and scaling machine learning workflows on Kubernetes. It provides a complete MLOps solution with pipelines, model serving, notebooks, and experiment tracking.
Why Kubeflow?
Kubeflow Advantages:
- Kubernetes native: Leverage K8s scalability and reliability
- End-to-end MLOps: From experimentation to production
- Portable: Run on any Kubernetes cluster
- Composable: Use only the components you need
- Open source: Active community and ecosystem
- ML pipeline orchestration
- Distributed training
- Model serving at scale
- Experiment tracking
- Feature engineering
Installation
1. Prerequisites
# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl && sudo mv kubectl /usr/local/bin/
# Install kustomize
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
sudo mv kustomize /usr/local/bin/
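Before continuing, it can be worth confirming that both tools are actually on the PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is illustrative and not part of any Kubeflow tooling.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["kubectl", "kustomize"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("kubectl and kustomize are both available.")
```

An empty result means both binaries were found; anything else lists what still needs installing.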
2. Install Kubeflow
# Clone manifests
git clone https://github.com/kubeflow/manifests.git
cd manifests
# Install with kustomize
while ! kustomize build example | kubectl apply -f -; do
  echo "Retrying..."
  sleep 10
done
# Check installation
kubectl get pods -n kubeflow
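Scanning a long pod list by eye is error-prone, so the check can also be scripted. The sketch below assumes the pod list was exported with `kubectl get pods -n kubeflow -o json` and parsed into a dictionary; the function name `pods_not_ready` and the sample pod names are hypothetical.

```python
def pods_not_ready(pod_list):
    """Return names of pods that are neither Succeeded nor Running with all containers ready.

    `pod_list` is the parsed JSON from `kubectl get pods -n kubeflow -o json`.
    """
    not_ready = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase")
        if phase == "Succeeded":
            continue  # completed one-off pods are fine
        statuses = status.get("containerStatuses", [])
        all_ready = bool(statuses) and all(c.get("ready") for c in statuses)
        if phase != "Running" or not all_ready:
            not_ready.append(name)
    return not_ready


# Hypothetical sample shaped like kubectl's JSON output.
sample = {
    "items": [
        {
            "metadata": {"name": "ml-pipeline-abc"},
            "status": {"phase": "Running", "containerStatuses": [{"ready": True}]},
        },
        {
            "metadata": {"name": "katib-controller-xyz"},
            "status": {"phase": "Pending"},
        },
    ]
}
print(pods_not_ready(sample))
```

With a real cluster, the dictionary would come from `json.load` on the saved kubectl output; an empty list suggests the installation has settled.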
3. Access Dashboard
# Port forward
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
# Access at http://localhost:8080
# Default credentials: user@example.com / 12341234
Kubeflow Pipelines
1. Basic Pipeline
from kfp import compiler, dsl
from kfp.dsl import Dataset, Input, Model, Output

# Each component runs in its own container, so files are passed between
# steps as KFP artifacts (Input/Output) rather than as local /tmp paths.
# gcsfs lets pandas read gs:// paths.
@dsl.component(packages_to_install=["pandas", "gcsfs"])
def preprocess_data(data_path: str, clean_data: Output[Dataset]):
    import pandas as pd

    df = pd.read_csv(data_path)
    df = df.dropna()
    df.to_csv(clean_data.path, index=False)

@dsl.component(packages_to_install=["pandas", "scikit-learn"])
def train_model(train_data: Input[Dataset], epochs: int, model: Output[Model]):
    import pickle
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    df = pd.read_csv(train_data.path)
    X = df.drop("target", axis=1)
    y = df["target"]
    # epochs is accepted for the pipeline interface; a random forest ignores it.
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(X, y)
    with open(model.path, "wb") as f:
        pickle.dump(clf, f)

@dsl.component(packages_to_install=["pandas", "scikit-learn"])
def evaluate_model(model: Input[Model], test_data: Input[Dataset]) -> float:
    import pickle
    import pandas as pd
    from sklearn.metrics import accuracy_score

    with open(model.path, "rb") as f:
        clf = pickle.load(f)
    df = pd.read_csv(test_data.path)
    X = df.drop("target", axis=1)
    y = df["target"]
    predictions = clf.predict(X)
    return float(accuracy_score(y, predictions))

@dsl.pipeline(name="ML Training Pipeline")
def ml_pipeline(data_path: str, epochs: int = 10):
    preprocess_task = preprocess_data(data_path=data_path)
    train_task = train_model(
        train_data=preprocess_task.outputs["clean_data"],
        epochs=epochs,
    )
    # Note: this scores the model on the same data it was trained on.
    evaluate_task = evaluate_model(
        model=train_task.outputs["model"],
        test_data=preprocess_task.outputs["clean_data"],
    )

# Compile pipeline
compiler.Compiler().compile(ml_pipeline, "pipeline.yaml")
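The pipeline function never states an execution order explicitly; the order follows from which task outputs feed which task inputs. The snippet below is a conceptual sketch of that idea using the standard library's `graphlib` (Python 3.9+). It is not how KFP is implemented, and the short task labels are illustrative stand-ins for the three steps above.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks whose outputs it consumes.
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}

# A valid execution order runs every task after all of its upstream tasks.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['preprocess', 'train', 'evaluate']
```

Because evaluation depends on both earlier steps, it can only be scheduled last, which matches what the dashboard's graph view shows for this pipeline.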
2. Run Pipeline
from kfp.client import Client

# Connect to Kubeflow
client = Client(host="http://localhost:8080/pipeline")

# Create experiment
experiment = client.create_experiment("my-experiment")

# Run pipeline
run = client.run_pipeline(
    experiment_id=experiment.experiment_id,
    job_name="training-run-1",
    pipeline_package_path="pipeline.yaml",
    params={"data_path": "gs://bucket/data.csv", "epochs": 20},
)
print(f"Run ID: {run.run_id}")
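Submitting a run returns immediately, so scripts usually need to wait for the result. The sketch below shows a generic polling loop under stated assumptions: the helper `wait_for_terminal_state` is hypothetical, the state names are modeled on the run states reported by the KFP v2 API, and a fake state source stands in for a live cluster. The KFP client also provides its own `wait_for_run_completion` method, which may be the simpler choice depending on the SDK version.

```python
import time

# Assumed terminal state names, modeled on KFP v2 run states.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED", "SKIPPED"}


def wait_for_terminal_state(get_state, timeout_s=600.0, poll_s=5.0, sleep=time.sleep):
    """Poll `get_state()` until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still in state {state!r} after {timeout_s}s")
        sleep(poll_s)


# Demonstration with a fake state source instead of a live cluster.
fake_states = iter(["PENDING", "RUNNING", "RUNNING", "SUCCEEDED"])
final_state = wait_for_terminal_state(
    lambda: next(fake_states), poll_s=0, sleep=lambda seconds: None
)
print(final_state)
```

Against a real deployment, `get_state` could be something like a lambda that fetches the run by ID and returns its state field; the exact attribute names should be checked against the installed SDK.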