Complete Google Cloud Run Tutorial for ML: Serverless ML Deployment
Google Cloud Run provides a serverless platform for deploying containerized ML models. It offers auto-scaling, pay-per-use pricing, and seamless integration with Google Cloud services.
Why Cloud Run for ML?
Key Benefits:
- Serverless: No infrastructure to manage
- Auto-scaling: Scales down to zero and back up automatically
- Cost-efficient: Pay only for actual usage
- Container-based: Deploy any framework
- Fast deployment: Deploy in seconds
Prerequisites
pip install google-cloud-run flask gunicorn
gcloud auth login
gcloud config set project your-project-id
Quick Start
1. Create the ML Service
# app.py
from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)

# Load the model once at startup
model = joblib.load("model.joblib")


@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()
    # Reshape the flat feature list into a single-row 2D array
    features = np.array(data["features"]).reshape(1, -1)
    prediction = model.predict(features)
    probability = model.predict_proba(features)
    return jsonify({
        "prediction": int(prediction[0]),
        "probability": probability[0].tolist()
    })


@app.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "healthy"})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
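The handler above indexes data["features"] directly, so a malformed request body surfaces as an unhandled server error rather than a clear client error. A minimal sketch of a validation step follows; the helper name parse_features and its expected_len parameter are hypothetical and not part of the original tutorial.

```python
from typing import Any, List, Optional


def parse_features(payload: Any, expected_len: Optional[int] = None) -> List[float]:
    """Validate a /predict request body and return the features as floats.

    Hypothetical helper (not part of the original tutorial). Raises
    ValueError with a readable message when the body is malformed.
    """
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError('body must be a JSON object with a "features" key')

    features = payload["features"]
    if not isinstance(features, list) or not features:
        raise ValueError('"features" must be a non-empty list')

    cleaned = []
    for value in features:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError('"features" must contain only numbers')
        cleaned.append(float(value))

    if expected_len is not None and len(cleaned) != expected_len:
        raise ValueError(f"expected {expected_len} features, got {len(cleaned)}")
    return cleaned
```

Inside predict(), one option would be to call parse_features(request.get_json(silent=True)) and return a 400 response with the error message whenever ValueError is raised.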
2. Create the Dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model.joblib .
COPY app.py .
ENV PORT=8080
EXPOSE 8080
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 app:app
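The CMD line passes the gunicorn settings as command-line flags. As a rough alternative, the same settings could live in a gunicorn.conf.py file, which gunicorn reads from the working directory by default. This is a sketch under assumptions: the file is hypothetical, and the Dockerfile would also need to copy it into /app.

```python
# gunicorn.conf.py (hypothetical file; the original Dockerfile passes flags instead)
import os

# Cloud Run injects PORT at runtime; fall back to 8080 for local runs.
port = os.environ.get("PORT", "8080")

bind = f":{port}"   # same as --bind :$PORT
workers = 1         # same as --workers 1
threads = 8         # same as --threads 8
```

With this file in place, the container command could shrink to gunicorn app:app. A single worker with several threads loads the model only once per container instance, which keeps memory use down.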
3. requirements.txt
flask==2.3.0
gunicorn==21.2.0
joblib==1.3.0
scikit-learn==1.3.0
numpy==1.24.0
4. Deploy to Cloud Run
# Build the container image
gcloud builds submit --tag gcr.io/your-project-id/ml-service
# Deploy
gcloud run deploy ml-service \
  --image gcr.io/your-project-id/ml-service \
  --platform managed \
  --region us-central1 \
  --memory 2Gi \
  --cpu 2 \
  --min-instances 0 \
  --max-instances 10 \
  --allow-unauthenticated
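Once the deploy command finishes, gcloud prints the service URL. A small client sketch using only the standard library shows how the /predict endpoint might be called. The SERVICE_URL value is a placeholder, and the function names are illustrative rather than part of the original tutorial.

```python
import json
import urllib.request

# Placeholder: replace with the URL printed by `gcloud run deploy`.
SERVICE_URL = "https://your-service-url.a.run.app"


def build_predict_request(features, base_url=SERVICE_URL):
    """Build a POST request for /predict carrying {"features": [...]}."""
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, base_url=SERVICE_URL, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling predict([5.1, 3.5, 1.4, 0.2]) (the feature values here are made up) should return a dictionary with the "prediction" and "probability" keys produced by the handler. Because the service was deployed with --allow-unauthenticated, no auth header is needed in this sketch.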
FastAPI Service
1. FastAPI Application
# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI(title="ML Prediction API")

# Load the model once at startup
model = joblib.load("model.joblib")


class PredictionRequest(BaseModel):
    features: list[float]


class PredictionResponse(BaseModel):
    prediction: int
    probability: list[float]


@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    try:
        features = np.array(request.features).reshape(1, -1)
        prediction = model.predict(features)
        probability = model.predict_proba(features)
        return PredictionResponse(
            prediction=int(prediction[0]),
            probability=probability[0].tolist()
        )
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))


@app.get("/health")
async def health():
    return {"status": "healthy"}
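The /predict handler is declared with async def, but model.predict is a blocking call, so it occupies the event loop while it runs. One option is to declare the handler with plain def, which FastAPI runs in a thread pool. Another is to offload the call with asyncio.to_thread (available from Python 3.9). The sketch below illustrates the second option with a stand-in model; SlowModel and predict_async are hypothetical names, not part of the original tutorial.

```python
import asyncio
import time


class SlowModel:
    """Stand-in for a loaded model whose predict() blocks (hypothetical)."""

    def predict(self, features):
        time.sleep(0.05)  # simulate blocking inference work
        return [int(sum(features) > 0)]


model = SlowModel()


async def predict_async(features):
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(model.predict, features)


async def main():
    # The two calls overlap instead of running back to back.
    return await asyncio.gather(
        predict_async([1.0, 2.0]),
        predict_async([-3.0, 1.0]),
    )


results = asyncio.run(main())
print(results)  # [[1], [0]]
```

Whether this helps in practice depends on the model: inference code that holds the GIL will still serialize across threads, so this is a starting point to measure rather than a guaranteed speedup.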
2. FastAPI Dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model.joblib .
COPY main.py .
ENV PORT=8080
EXPOSE 8080
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
PyTorch Service
1. PyTorch Inference
# servepytorch.py
from flask import Flask, request, jsonify
import torch
import torch.nn as nn
import numpy as np
app = Flask(__name__)
class SimpleNN(nn.Module):