Google Cloud Run for ML Tutorial: Serverless ML Deployment


By Ruby Abdullah · tutorial
Tags: GCP, Cloud Run, Serverless, ML Deployment, Docker, Production


Google Cloud Run provides a serverless platform for deploying containerized ML models. It offers auto-scaling, pay-per-use pricing, and seamless integration with Google Cloud services.

Why Cloud Run for ML?

Key Benefits:
  • Serverless: No servers or clusters to provision or manage
  • Auto-scaling: Scales down to zero when idle and up with incoming traffic
  • Cost-effective: Pay only for actual usage
  • Container-based: Deploy any framework that runs in a container
  • Fast deployment: Roll out a built image with a single command

Prerequisites

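# Python packages used in this tutorial
pip install google-cloud-run flask gunicorn

# Authenticate the gcloud CLI and select the target project
gcloud auth login
gcloud config set project your-project-id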

Quick Start

1. Create ML Service

# app.py
from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)

# Load the model once at startup so every request reuses it
model = joblib.load("model.joblib")


@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body of the form {"features": [...]}
    data = request.get_json()
    features = np.array(data["features"]).reshape(1, -1)
    prediction = model.predict(features)
    probability = model.predict_proba(features)
    return jsonify({
        "prediction": int(prediction[0]),
        "probability": probability[0].tolist()
    })


@app.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "healthy"})


if __name__ == "__main__":
    # Local development only; the container starts gunicorn instead
    app.run(host="0.0.0.0", port=8080)
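The /predict handler passes the "features" value straight to NumPy, so a missing key or a non-numeric entry surfaces as an unhandled server error. A minimal sketch of a validation step follows, assuming a hypothetical helper named validate_features and an expected feature count that you supply yourself (neither is part of the original service):

```python
import math
from typing import Any, List


def validate_features(payload: Any, expected_len: int) -> List[float]:
    """Return the feature vector from a decoded JSON body, or raise ValueError.

    Hypothetical helper: `expected_len` is the number of features the model
    was trained on, which the caller has to know.
    """
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError('request body must be a JSON object with a "features" key')

    features = payload["features"]
    if not isinstance(features, list) or len(features) != expected_len:
        raise ValueError(f'"features" must be a list of {expected_len} numbers')

    cleaned = []
    for value in features:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError('"features" must contain only numbers')
        try:
            number = float(value)
        except OverflowError:
            # Integers too large for a float cannot be fed to the model.
            raise ValueError('"features" contains a value that is too large') from None
        if not math.isfinite(number):
            raise ValueError('"features" must contain only finite numbers')
        cleaned.append(number)
    return cleaned
```

In the handler, one option is to call this helper on the decoded body, catch ValueError, and return the message with a 400 status rather than letting the request fail with a 500.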

2. Create Dockerfile

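FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY model.joblib .
COPY app.py .

ENV PORT=8080
EXPOSE 8080

# One worker with 8 threads, bound to the port given in $PORT
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 app:app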
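The Dockerfile binds gunicorn to $PORT, while the local app.run fallback in app.py is fixed to 8080. Cloud Run passes the listening port to the container through the PORT environment variable, so it can be convenient for local runs to honor the same variable. A minimal sketch, assuming a hypothetical helper named resolve_port that is not part of the original app:

```python
import os
from typing import Mapping, Optional


def resolve_port(environ: Optional[Mapping[str, str]] = None, default: int = 8080) -> int:
    """Return the port to listen on, taken from PORT when it is set."""
    env = os.environ if environ is None else environ
    raw = env.get("PORT", "").strip()
    if not raw:
        # PORT is unset or blank: fall back to the tutorial's default.
        return default
    port = int(raw)  # raises ValueError on a non-numeric value
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

With this in place, the development entry point could read app.run(host="0.0.0.0", port=resolve_port()), which keeps local runs and the container on the same setting.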

3. requirements.txt

flask==2.3.0
gunicorn==21.2.0
joblib==1.3.0
scikit-learn==1.3.0
numpy==1.24.0

4. Deploy to Cloud Run

# Build the container image with Cloud Build
gcloud builds submit --tag gcr.io/your-project-id/ml-service

# Deploy the image to Cloud Run
gcloud run deploy ml-service \
  --image gcr.io/your-project-id/ml-service \
  --platform managed \
  --region us-central1 \
  --memory 2Gi \
  --cpu 2 \
  --min-instances 0 \
  --max-instances 10 \
  --allow-unauthenticated
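Once the deploy command finishes, gcloud prints the service URL, and the /predict endpoint can be exercised from any HTTP client. The following is a small standard-library sketch of such a client. The function names, the example URL, and the four-value feature vector in the usage comment are placeholders rather than anything from the original tutorial:

```python
import json
import urllib.request
from typing import List


def build_predict_request(base_url: str, features: List[float]) -> urllib.request.Request:
    """Build the POST request for /predict without sending it."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, features: List[float], timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response from the service."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (hypothetical URL; substitute the one printed by gcloud):
#   predict("https://ml-service-example.a.run.app", [5.1, 3.5, 1.4, 0.2])
```

Because the service scales to zero, the first call after an idle period may take noticeably longer while an instance starts, which is one reason the sketch uses a generous timeout.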

FastAPI Service

1. FastAPI Application

# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI(title="ML Prediction API")

# Load the model once at startup
model = joblib.load("model.joblib")


class PredictionRequest(BaseModel):
    features: list[float]


class PredictionResponse(BaseModel):
    prediction: int
    probability: list[float]


@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    try:
        features = np.array(request.features).reshape(1, -1)
        prediction = model.predict(features)
        probability = model.predict_proba(features)
        return PredictionResponse(
            prediction=int(prediction[0]),
            probability=probability[0].tolist()
        )
    except Exception as e:
        # Any failure during inference is reported to the client as a 400
        raise HTTPException(status_code=400, detail=str(e))


@app.get("/health")
async def health():
    return {"status": "healthy"}
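The predict handler is declared with async def but calls the model synchronously, so the event loop cannot serve other requests while inference runs. One common way around this is to hand the blocking call to a worker thread. The sketch below shows the pattern with the standard library only. Here blocking_predict is a hypothetical stand-in for the model call, not the tutorial's model:

```python
import asyncio
import time
from typing import List


def blocking_predict(features: List[float]) -> int:
    """Hypothetical stand-in for model.predict: synchronous and slow."""
    time.sleep(0.05)
    return int(sum(features) > 0)


async def predict_async(features: List[float]) -> int:
    # asyncio.to_thread (Python 3.9+) runs the call in a worker thread,
    # so the event loop stays free to accept other requests meanwhile.
    return await asyncio.to_thread(blocking_predict, features)


async def main() -> List[int]:
    # Two "requests" overlap instead of running back to back.
    return await asyncio.gather(
        predict_async([1.0, 2.0]),
        predict_async([-3.0, 1.0]),
    )
```

A simpler alternative in FastAPI is to declare the handler with plain def, in which case the framework runs it in a thread pool on its own.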

2. FastAPI Dockerfile

FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY model.joblib .
COPY main.py .

ENV PORT=8080
EXPOSE 8080

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]

PyTorch Service

1. PyTorch Inference

# serve_pytorch.py
from flask import Flask, request, jsonify
import torch
import torch.nn as nn
import numpy as np

app = Flask(__name__)

class SimpleNN(nn.Module):
