Complete BentoML Tutorial: Packaging and Serving ML Models to Production

BentoML is an open-source framework for building, shipping, and scaling AI applications. With BentoML, you can turn an ML model into a production-ready API service, with containerization, batching, and monitoring included.

Why BentoML?

Challenges in ML deployment:

Packaging complexity: bundling the model with its dependencies
Serving infrastructure: setting up a web server and API endpoints
Performance: batching, caching, GPU utilization
Scalability: horizontal scaling, load balancing
Multi-framework: supporting a variety of ML frameworks

BentoML Solutions:

Unified API for all ML frameworks
Auto-generated REST/gRPC APIs
Built-in adaptive batching
Docker/Kubernetes deployment ready
Model versioning and management

Installation

# Install BentoML
pip install bentoml

# With framework-specific support
pip install "bentoml[pytorch]"
pip install "bentoml[tensorflow]"
pip install "bentoml[sklearn]"
pip install "bentoml[transformers]"

# Verify installation
bentoml --version

Quick Start

1. Save the Model to BentoML

# train_and_save.py
import bentoml
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train the model
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100)
model.fit(X, y)

# Save the model to the BentoML model store
saved_model = bentoml.sklearn.save_model(
    "irisclassifier",
    model,
    # Mark predict as batchable along the first (row) dimension
    signatures={
        "predict": {"batchable": True, "batch_dim": 0}
    },
    labels={"framework": "sklearn", "dataset": "iris"},
    metadata={"accuracy": 0.97},
)

print(f"Model saved: {saved_model}")
# Output: Model(tag="irisclassifier:abc123")
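
The signature above marks predict as batchable along dimension 0. As a rough illustration of what that permits, the sketch below merges the rows of several requests, calls a model once, and slices the result back per request. It uses plain Python lists only; merge_batches, split_batches, and fake_predict are hypothetical names invented for this example and are not BentoML functions, and the real batching logic inside BentoML is more involved.

```python
# Illustrative sketch only: a plain-Python stand-in for batching along dimension 0.
# merge_batches, split_batches and fake_predict are hypothetical helpers.

def merge_batches(requests):
    """Concatenate per-request row lists along dimension 0 and record each size."""
    merged, sizes = [], []
    for rows in requests:
        merged.extend(rows)
        sizes.append(len(rows))
    return merged, sizes


def split_batches(outputs, sizes):
    """Slice the combined output back into one chunk per original request."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(outputs[start:start + size])
        start += size
    return chunks


def fake_predict(rows):
    """Stand-in for model.predict: returns one label per input row (not a real classifier)."""
    return [0 if row[2] < 2.5 else 1 for row in rows]


request_a = [[5.1, 3.5, 1.4, 0.2]]                        # one row
request_b = [[6.2, 2.9, 4.3, 1.3], [5.9, 3.0, 5.1, 1.8]]  # two rows

merged, sizes = merge_batches([request_a, request_b])
results = split_batches(fake_predict(merged), sizes)
print(results)
```

Each caller gets back exactly as many predictions as the rows it sent, which is why the batch dimension has to be declared when the model is saved.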

2. Create a Service

# service.py
import bentoml
import numpy as np
from bentoml.io import JSON, NumpyNdarray

# Load the latest saved model from the model store
iris_model = bentoml.sklearn.get("irisclassifier:latest")

# Create a runner
iris_runner = iris_model.to_runner()

# Create the service
svc = bentoml.Service("irisservice", runners=[iris_runner])

# Define the API endpoint
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
async def predict(input_array: np.ndarray) -> np.ndarray:
    return await iris_runner.predict.async_run(input_array)

# Alternative: JSON input/output
@svc.api(input=JSON(), output=JSON())
async def classify(input_data: dict) -> dict:
    # Reshape the flat feature list into a single-row 2D array
    features = np.array(input_data["features"]).reshape(1, -1)
    prediction = await iris_runner.predict.async_run(features)
    class_names = ["setosa", "versicolor", "virginica"]
    return {
        "prediction": int(prediction[0]),
        "classname": class_names[int(prediction[0])],
    }
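
The JSON endpoint reads the "features" key without checking it, so a malformed payload would fail inside NumPy or the model. One possible guard is sketched below. parse_features and EXPECTED_FEATURES are hypothetical names introduced for this example, not part of BentoML, and the count of four features is an assumption based on the iris dataset used in this tutorial.

```python
# Hypothetical validation helper for the JSON endpoint; not a BentoML API.
EXPECTED_FEATURES = 4  # assumption: the four iris measurements


def parse_features(payload):
    """Return the features as a list of floats, or raise ValueError with a readable message."""
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError('payload must be an object with a "features" key')
    features = payload["features"]
    if not isinstance(features, (list, tuple)) or len(features) != EXPECTED_FEATURES:
        raise ValueError(f'"features" must be a list of {EXPECTED_FEATURES} numbers')
    for value in features:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric feature: {value!r}")
    return [float(value) for value in features]
```

Inside the endpoint, the returned list could be passed to np.array(...).reshape(1, -1) in place of the raw payload value, with the ValueError turned into whatever error response the service should return.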

3. Run Service Locally

# Development server
bentoml serve service:svc --reload

# Production server
bentoml serve service:svc --production

# Specify port
bentoml serve service:svc --port 3000
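
The server takes a moment to load the model after it starts, so a script that sends requests right away may hit a refused connection. The sketch below polls a health URL until it answers or a timeout passes. http_probe and wait_until_ready are hypothetical helpers written for this example. The /readyz path in the usage note is an assumption about the health endpoint the server exposes; verify it against the BentoML version you have installed.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout_s=1.0):
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready"
        return False


def wait_until_ready(url, timeout_s=30.0, interval_s=0.5, probe=http_probe):
    """Poll url with probe until it succeeds or timeout_s elapses; return the outcome."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

With the server from this section on port 3000, a call such as wait_until_ready("http://localhost:3000/readyz") would return True once the service responds. The probe parameter is injectable so the loop can be exercised without a running server.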

4. Test Service

# test_service.py
import numpy as np
import requests

# Test with NumpyNdarray
data = np.array([[5.1, 3.5, 1.4, 0.2]])
response = requests.post(
    "http://localhost:3000/predict",
    headers={"content-type": "application/json"},
    json=data.tolist(),
)
print(f"Prediction: {response.json()}")

# Test with JSON
response = requests.post(
    "http://localhost:3000/classify",
    json={"features": [5.1, 3.5, 1.4, 0.2]},
)
print(f"Classification: {response.json()}")

Building Bentos

1. Create bentofile.yaml

# bentofile.yaml
service: "service:svc"
labels:
  owner: ml-team
  project: iris-classifier
include:
  - "*.py"
python:
  packages:
    - scikit-learn
    - numpy
docker:
  distro: debian
  python_version: "3.10"
