LiteLLM: Universal API Gateway for 100+ LLM Models

In the rapidly evolving AI landscape, we face dozens of LLM (Large Language Model) providers such as OpenAI, Anthropic, Google Gemini, Cohere, Ollama, and many more. Each provider has different API formats, authentication methods, and parameters. Imagine if you could call all these models with a single, unified interface. That is exactly what LiteLLM offers.

LiteLLM is an open-source Python library that provides a unified interface for calling 100+ LLMs using the OpenAI API format. With LiteLLM, you write your calling code once and switch between providers by changing only the model name.

Why LiteLLM?

Here are several reasons why LiteLLM has become a popular choice:

Unified API: One calling format for all providers
100+ Models: Supports OpenAI, Anthropic, Google, Cohere, Ollama, HuggingFace, and more
OpenAI-Compatible Proxy: Run a proxy server compatible with the OpenAI API
Fallback & Retry: Automatically switch to another model if one fails
Cost Tracking: Track usage costs for each model
Load Balancing: Distribute requests across multiple models/providers
Streaming: Built-in streaming response support
Production Ready: Router for large-scale production deployments

Installation and Setup

Installing LiteLLM

Install LiteLLM using pip:

pip install litellm

For the proxy server feature, install with additional dependencies:

pip install 'litellm[proxy]'

Configuring API Keys

Before getting started, prepare your API keys from the providers you want to use. Store them as environment variables:

# OpenAI
export OPENAI_API_KEY="sk-your-openai-key"

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"

# Google Gemini
export GEMINI_API_KEY="your-gemini-key"

You can also use a .env file and load it with python-dotenv:

from dotenv import load_dotenv

# Read key=value pairs from a .env file into the process environment
load_dotenv()
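Before making any calls, it can help to confirm that the keys are actually visible to the Python process. The following is a minimal sketch using only the standard library. The helper name `missing_keys` and the `PROVIDER_KEYS` mapping are illustrative assumptions, not part of LiteLLM; the variable names themselves (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`) are the ones LiteLLM reads for these three providers.

```python
import os

# Environment variable that holds the API key for each provider.
# The mapping itself is a hypothetical helper structure for this sketch.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_keys(providers, env=None):
    """Return the variable names that are unset or empty for the given providers."""
    env = os.environ if env is None else env
    return [PROVIDER_KEYS[p] for p in providers if not env.get(PROVIDER_KEYS[p])]


if __name__ == "__main__":
    missing = missing_keys(["openai", "anthropic", "gemini"])
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```

Passing `env` explicitly makes the check easy to exercise in tests without touching the real environment.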

Basic Completion Calls

Calling OpenAI GPT

import litellm

# Call OpenAI GPT-4o
response = litellm.completion(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what machine learning is in 2 sentences."}
    ]
)

print(response.choices[0].message.content)

Calling Anthropic Claude

# Call Anthropic Claude - just change the model name!
response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[
        {"role": "user", "content": "Explain what a neural network is in 2 sentences."}
    ]
)

print(response.choices[0].message.content)

Calling Google Gemini

# Call Google Gemini
response = litellm.completion(
    model="gemini/gemini-2.0-flash",
    messages=[
        {"role": "user", "content": "What is the difference between supervised and unsupervised learning?"}
    ]
)

print(response.choices[0].message.content)

Calling Ollama (Local Models)

# Call a local model via Ollama
response = litellm.completion(
    model="ollama/llama3",
    messages=[
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."}
    ],
    api_base="http://localhost:11434"  # default address of a local Ollama server
)

print(response.choices[0].message.content)
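The examples above follow a naming pattern: Anthropic, Gemini, and Ollama models are written as `provider/model-name`, while the OpenAI model is passed without a prefix. The sketch below illustrates that convention with a small string helper. It is only an illustration of the pattern: `split_model` and its `default_provider` argument are hypothetical, and LiteLLM's own provider resolution is more involved than a simple split.

```python
def split_model(model: str, default_provider: str = "openai"):
    """Split a 'provider/model-name' string into (provider, model_name).

    Strings without a slash are attributed to default_provider, which is an
    assumption made for this sketch based on the unprefixed "gpt-4o" example.
    """
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix present: treat the whole string as the model name.
        return default_provider, model
    return provider, name


if __name__ == "__main__":
    for m in ["gpt-4o", "anthropic/claude-sonnet-4-20250514", "ollama/llama3"]:
        print(m, "->", split_model(m))
```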

Unified Interface - Same Code, Different Providers

The key advantage of LiteLLM is that you can create a generic function that works with any provider:

import litellm

def ask_ai(question: str, model: str = "gpt-4o") -> str:
    """Universal function to query any AI model."""
    response = litellm.completion(
        model=model,
        messages=[
            {"role": "user", "content": question}
        ],
        temperature=0.7,
        max_tokens=1000
    )
    # Return the generated text, matching the declared str return type
    return response.choices[0].message.content
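Because every provider is reached through the same call, a generic function like the one above extends naturally to trying several models in order. The following is a hand-rolled sketch of that idea, not LiteLLM's built-in fallback mechanism. The names `ask_with_fallback` and `complete` are hypothetical; `complete` defaults to `litellm.completion` and exists so the logic can be exercised with a stub instead of a real API call.

```python
from typing import Callable, Optional, Sequence


def ask_with_fallback(
    question: str,
    models: Sequence[str],
    complete: Optional[Callable] = None,
) -> str:
    """Try each model in order and return the first successful answer."""
    if complete is None:
        # Imported lazily so the helper can be tested without LiteLLM installed.
        import litellm

        complete = litellm.completion

    errors = []
    for model in models:
        try:
            response = complete(
                model=model,
                messages=[{"role": "user", "content": question}],
            )
            return response.choices[0].message.content
        except Exception as exc:  # broad on purpose: error types vary by provider
            errors.append(f"{model}: {exc}")

    # Every model failed (or the list was empty): report what went wrong.
    raise RuntimeError("All models failed: " + "; ".join(errors))
```

A call such as `ask_with_fallback("What is overfitting?", ["gpt-4o", "anthropic/claude-sonnet-4-20250514"])` would query the second model only if the first raises an error. Catching every exception keeps the sketch short; production code would likely distinguish retryable errors from configuration mistakes such as a missing key.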
