Complete FastAPI for ML Tutorial: Build Production ML APIs
FastAPI is a modern, high-performance Python web framework for building APIs. It generates OpenAPI documentation automatically, validates requests from Python type hints, and supports async handlers, which makes it a strong fit for serving machine learning models as production APIs.
Why FastAPI for ML?
FastAPI Advantages:
- High performance: On par with NodeJS and Go
- Type safety: Pydantic validation
- Auto documentation: Swagger UI and ReDoc
- Async support: Handle concurrent requests
- Easy testing: Built-in test client
Common Use Cases:
- ML model serving APIs
- Real-time inference endpoints
- Batch prediction services
- Feature engineering APIs
- Model management systems
Installation
pip install fastapi uvicorn
# With ML dependencies
pip install fastapi uvicorn scikit-learn joblib numpy pandas
# For async database access (quotes keep the shell from expanding the brackets)
pip install "fastapi[all]" sqlalchemy asyncpg
# Verify installation
python -c "import fastapi; print(fastapi.__version__)"
Quick Start
1. Hello World API
# main.py
from fastapi import FastAPI

app = FastAPI(
    title="ML API",
    description="Machine Learning API with FastAPI",
    version="1.0.0",
)

@app.get("/")
def read_root():
    return {"message": "Welcome to ML API"}

@app.get("/health")
def health_check():
    return {"status": "healthy"}

# Run server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# Access docs at http://localhost:8000/docs
2. Simple ML Prediction Endpoint
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI()

# Load the model once at startup, not on every request
model = joblib.load("model.joblib")

class PredictionRequest(BaseModel):
    features: list[float]

class PredictionResponse(BaseModel):
    prediction: float
    probability: list[float] | None = None

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    # scikit-learn expects a 2D array: one row per sample
    features = np.array(request.features).reshape(1, -1)
    prediction = model.predict(features)[0]

    # Get class probabilities if the model is a classifier
    probability = None
    if hasattr(model, "predict_proba"):
        probability = model.predict_proba(features)[0].tolist()

    return PredictionResponse(
        prediction=float(prediction),
        probability=probability,
    )
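The endpoint above assumes a `model.joblib` file already exists next to the application. One way to produce such a file for local experiments is sketched below, assuming scikit-learn is installed. The synthetic data, the four-feature input shape, and the choice of `LogisticRegression` are illustrative assumptions, not part of the original tutorial.

```python
import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic training data (assumption): 200 samples, 4 features,
# label is 1 when the first two features sum to a positive value
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Train a small classifier and persist it where the API expects it
model = LogisticRegression().fit(X, y)
joblib.dump(model, "model.joblib")

# Reload and run the same steps the /predict handler performs
loaded = joblib.load("model.joblib")
features = np.array([0.5, 1.0, -0.2, 0.3]).reshape(1, -1)
prediction = loaded.predict(features)[0]
probability = loaded.predict_proba(features)[0].tolist()
```

With a model trained this way, a request to `/predict` would need exactly four values in `features`, since the model rejects inputs of any other width.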
Request/Response Models
1. Pydantic Models
from pydantic import BaseModel, Field
from typing import Optional, List
from enum import Enum

class ModelType(str, Enum):
    classification = "classification"
    regression = "regression"

class FeatureInput(BaseModel):
    age: int = Field(..., ge=0, le=120, description="Age in years")
    income: float = Field(..., gt=0, description="Annual income")
    education: str = Field(..., description="Education level")

    # Pydantic v1 style; in Pydantic v2 use model_config = {"json_schema_extra": {...}}
    class Config:
        schema_extra = {
            "example": {
                "age": 35,
                "income": 75000.0,
                "education": "bachelor",
            }
        }

class BatchPredictionRequest(BaseModel):
    instances: List[FeatureInput]
    model_version: Optional[str] = "latest"

class PredictionResult(BaseModel):
    prediction: float
    confidence: float
    model_version: str

class BatchPredictionResponse(BaseModel):
    predictions: List[PredictionResult]
    processing_time_ms: float
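To see what the `Field` constraints do in practice, the sketch below validates a few payloads against `FeatureInput` directly, outside of any endpoint. The helper `failing_fields` is a hypothetical name introduced only for this illustration. When the same model is used as a FastAPI request body, a failed validation is returned to the client as an HTTP 422 response.

```python
from pydantic import BaseModel, Field, ValidationError


class FeatureInput(BaseModel):
    age: int = Field(..., ge=0, le=120, description="Age in years")
    income: float = Field(..., gt=0, description="Annual income")
    education: str = Field(..., description="Education level")


# A payload that satisfies every constraint
valid = FeatureInput(age=35, income=75000.0, education="bachelor")


def failing_fields(payload: dict) -> list[str]:
    """Return the sorted names of fields that fail validation (hypothetical helper)."""
    try:
        FeatureInput(**payload)
    except ValidationError as exc:
        # Each error dict carries a "loc" tuple whose first item is the field name
        return sorted(str(err["loc"][0]) for err in exc.errors())
    return []


# age exceeds le=120 and income violates gt=0, so both fields are reported
bad = failing_fields({"age": 150, "income": -5, "education": "bachelor"})
```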
2. Validation
from typing import List
# Pydantic v1 decorators; Pydantic v2 replaces them with field_validator and model_validator
from pydantic import BaseModel, validator, root_validator

class MLRequest(BaseModel):
    features: List[float]

    @validator('features')
    def check_feature_count(cls, v):
        if len(v) != 10:
            raise ValueError('Must provide exactly 10 features')
        return v

    @validator('features', each_item=True)
    def check_feature_range(cls, v):
        if not -1 <= v <= 1: