How to Set Up Gunicorn with FastAPI for Concurrent Invocation

By Ruby Abdullah · · tutorial
Tags: FastAPI, Gunicorn, Python, Concurrency, Production

Introduction

FastAPI is a modern, fast Python framework for building APIs. However, Uvicorn, the server typically used to run it during development, starts only a single worker process by default. That process runs one event loop on one CPU core, so a blocking or CPU-heavy request holds up every other request it is serving. For production and for handling concurrent requests reliably, we need Gunicorn as a process manager.

Gunicorn (Green Unicorn) is an HTTP server for Python that can run multiple worker processes, allowing our FastAPI application to handle many requests simultaneously.

Why Gunicorn?

Advantages of using Gunicorn:
  • Multiple worker processes for concurrent request handling
  • Automatic load balancing between workers
  • Graceful restart without downtime
  • Production-ready and battle-tested
  • Compatible with uvicorn workers for async support

Installation

First, install the required dependencies:

pip install fastapi uvicorn gunicorn

Or add to requirements.txt:

fastapi==0.109.0
uvicorn[standard]==0.27.0
gunicorn==21.2.0

Then install:

pip install -r requirements.txt

Creating a Simple FastAPI Application

Create a main.py file:

import os
import time

from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    # Report which worker process served the request
    return {
        "message": "Hello World",
        "worker_pid": os.getpid(),
    }


@app.get("/slow")
async def slow_endpoint():
    # Simulate a time-consuming operation. time.sleep() blocks this
    # worker's event loop, so the worker serves nothing else meanwhile.
    time.sleep(5)
    return {
        "message": "This took 5 seconds",
        "worker_pid": os.getpid(),
    }


@app.get("/health")
async def health_check():
    return {"status": "healthy"}
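To see why the /slow endpoint ties up a whole worker, here is a minimal sketch that runs without FastAPI. It compares a blocking time.sleep() with a cooperative asyncio.sleep() inside coroutines. The function names are illustrative, and the delay is shortened to 0.2 seconds.

```python
import asyncio
import time


async def blocking_handler() -> None:
    # Like /slow: time.sleep() holds the event loop for the whole wait.
    time.sleep(0.2)


async def cooperative_handler() -> None:
    # asyncio.sleep() yields control, so other coroutines can run meanwhile.
    await asyncio.sleep(0.2)


async def time_two_calls(handler) -> float:
    """Run two calls of `handler` concurrently and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(handler(), handler())
    return time.perf_counter() - start


def measure() -> tuple[float, float]:
    """Return (blocking_seconds, cooperative_seconds) for two concurrent calls."""
    blocking = asyncio.run(time_two_calls(blocking_handler))
    cooperative = asyncio.run(time_two_calls(cooperative_handler))
    return blocking, cooperative
```

Calling measure() should show the blocking pair taking about twice as long as the cooperative pair, because the two blocking calls run one after the other. Multiple worker processes soften this problem by giving each blocked request its own process.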

Setting Up Gunicorn with Uvicorn Workers

Gunicorn is a WSGI server, and its default worker class is synchronous. FastAPI is an ASGI application, so to keep its async benefits we run it under Uvicorn's worker class.

Running from Command Line

gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000

Parameter explanation:
  • main:app - module:application (main.py with app variable)
  • --workers 4 - number of worker processes
  • --worker-class uvicorn.workers.UvicornWorker - use async uvicorn workers
  • --bind 0.0.0.0:8000 - host and port
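If you launch the server from a Python script (for example with subprocess.run), the same parameters can be assembled programmatically. This is only a sketch: build_gunicorn_command is a hypothetical helper, not part of Gunicorn, and it uses only the flags shown above.

```python
import shlex


def build_gunicorn_command(app: str = "main:app", workers: int = 4,
                           host: str = "0.0.0.0", port: int = 8000) -> list[str]:
    """Assemble the gunicorn argument list from the parameters explained above."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return [
        "gunicorn", app,                                    # module:application
        "--workers", str(workers),                          # worker processes
        "--worker-class", "uvicorn.workers.UvicornWorker",  # async workers
        "--bind", f"{host}:{port}",                         # host and port
    ]


# Print the command as a single shell-ready string.
print(shlex.join(build_gunicorn_command(workers=2, port=9000)))
```

Keeping the arguments as a list avoids shell quoting problems when the list is passed directly to subprocess.run.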

Creating a Configuration File

Create a gunicornconf.py file for more structured configuration:

import multiprocessing
import os

# Server Socket
bind = "0.0.0.0:8000"
backlog = 2048

# Worker Processes
workers = int(os.getenv("WORKERS", multiprocessing.cpu_count() * 2 + 1))
worker_class = "uvicorn.workers.UvicornWorker"
worker_connections = 1000
max_requests = 10000
max_requests_jitter = 1000
timeout = 120
keepalive = 5

# Logging
accesslog = "-"
errorlog = "-"
loglevel = "info"
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" %(D)s'

# Process Naming
proc_name = "fastapiapp"

# Server Mechanics
daemon = False
pidfile = None
user = None
group = None
tmp_upload_dir = None

# Graceful Timeout
graceful_timeout = 30

Run with config file:

gunicorn main:app -c gunicornconf.py
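Because the config file is plain Python, you can preview the values it produces before starting the server, for instance to confirm that a WORKERS environment variable is picked up. The sketch below only approximates this: load_settings is a hypothetical helper, and it does not reproduce how Gunicorn itself validates settings.

```python
import runpy
import types


def load_settings(path: str) -> dict:
    """Execute a Python config file and return its plain-value settings."""
    namespace = runpy.run_path(path)
    settings = {}
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # skip entries such as __name__ and __builtins__
        if isinstance(value, types.ModuleType) or callable(value):
            continue  # skip imported modules and any functions
        settings[name] = value
    return settings
```

For example, load_settings("gunicornconf.py")["workers"] returns the worker count the file computes on the current machine.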

Determining Optimal Number of Workers

General formula for determining number of workers:

workers = (2 * CPU_CORES) + 1

Examples:
  • 2 CPU cores = 5 workers
  • 4 CPU cores = 9 workers
  • 8 CPU cores = 17 workers
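The formula and the examples above can be checked with a few lines of Python. This is a sketch: the function names are illustrative, and the WORKERS override mirrors the environment variable used in the config file.

```python
import os


def recommended_workers(cpu_cores: int) -> int:
    """Apply the general formula: (2 * CPU cores) + 1."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return 2 * cpu_cores + 1


def resolve_workers(env=None, cpu_cores=None) -> int:
    """Prefer an explicit WORKERS value, else fall back to the formula."""
    env = os.environ if env is None else env
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    override = env.get("WORKERS")
    return int(override) if override else recommended_workers(cores)


for cores in (2, 4, 8):
    print(cores, "CPU cores ->", recommended_workers(cores), "workers")
# 2 CPU cores -> 5 workers
# 4 CPU cores -> 9 workers
# 8 CPU cores -> 17 workers
```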

Tips:
  • For CPU-bound tasks: use the formula above
  • For I/O-bound tasks: you can use more workers
  • Monitor memory usage so the workers don't exhaust RAM; each worker is a separate process with its own memory
  • Start with the standard formula, then adjust based on monitoring
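One way to respect the memory tip is to cap the formula's result by how many workers fit in RAM. In this sketch, cap_workers_by_memory is a hypothetical helper and all the numbers are placeholders. Measure your own per-worker footprint before relying on it.

```python
def cap_workers_by_memory(desired_workers: int, available_mb: int,
                          per_worker_mb: int, reserve_mb: int = 0) -> int:
    """Limit the worker count to what fits in memory, with a minimum of 1."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable_mb = max(available_mb - reserve_mb, 0)  # keep headroom for the OS
    fit = usable_mb // per_worker_mb               # workers that fit in RAM
    return max(1, min(desired_workers, fit))


# 4 cores suggest 9 workers, but 2048 MB minus a 512 MB reserve fits only
# five workers of 300 MB each.
print(cap_workers_by_memory(9, available_mb=2048, per_worker_mb=300, reserve_mb=512))
# 5
```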

Testing Concurrent Requests

Test whether concurrent invocation works correctly.

Using Python

Create a testconcurrent.py file:

import asyncio
import aiohttp
import time
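A concurrency check along these lines can also be sketched with only the standard library. The version below swaps aiohttp for a thread pool and urllib, so it is a different technique from the aiohttp script. The function names and the URL in the usage note are assumptions.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def run_concurrently(fetch, url: str, total_requests: int):
    """Send `total_requests` requests at once; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=total_requests) as pool:
        results = list(pool.map(fetch, [url] * total_requests))
    return results, time.perf_counter() - start


def distinct_worker_pids(results) -> list:
    """Collect the worker PIDs seen in the responses.

    Accepts any key ending in "pid" so the sketch does not depend on the
    exact field name returned by the endpoint.
    """
    pids = set()
    for result in results:
        for key, value in result.items():
            if key.endswith("pid"):
                pids.add(value)
    return sorted(pids)
```

With the server running locally, call run_concurrently(fetch_json, "http://localhost:8000/slow", 4) and pass the results to distinct_worker_pids. If the four requests land on different workers, the elapsed time should be close to 5 seconds rather than 20, and several PIDs should appear. How requests are spread across workers is up to the operating system, so results can vary between runs.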
