LiteLLM: Universal API Gateway for 100+ LLM Models


By Ruby Abdullah · tutorial
Tags: LiteLLM, LLM, API Gateway, OpenAI, Python


In the rapidly evolving AI landscape, we face dozens of LLM (Large Language Model) providers such as OpenAI, Anthropic, Google Gemini, Cohere, Ollama, and many more. Each provider has different API formats, authentication methods, and parameters. Imagine if you could call all these models with a single, unified interface. That is exactly what LiteLLM offers.

LiteLLM is an open-source Python library that provides a unified interface for calling 100+ LLMs using the OpenAI API format. With LiteLLM, you write your calling code once and switch between providers by changing only the model name.

Why LiteLLM?

Here are several reasons why LiteLLM has become a popular choice:

  • Unified API: One calling format for all providers
  • 100+ Models: Supports OpenAI, Anthropic, Google, Cohere, Ollama, HuggingFace, and more
  • OpenAI-Compatible Proxy: Run a proxy server compatible with the OpenAI API
  • Fallback & Retry: Automatically switch to another model if one fails
  • Cost Tracking: Track usage costs for each model
  • Load Balancing: Distribute requests across multiple models/providers
  • Streaming: Built-in streaming response support
  • Production Ready: Router for large-scale production deployments
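To make the fallback idea concrete, here is a minimal hand-written sketch of what "switch to another model if one fails" means. It is not LiteLLM's built-in fallback mechanism; the helper name `complete_with_fallback` and its `completion_fn` parameter are assumptions made for illustration, so that the logic can be tried with a stub instead of a real provider.

```python
from typing import Any, Callable, Optional, Sequence


def complete_with_fallback(
    models: Sequence[str],
    messages: list,
    completion_fn: Optional[Callable[..., Any]] = None,
) -> Any:
    """Try each model in order and return the first successful response.

    Hypothetical helper for illustration only.
    """
    if completion_fn is None:
        # Imported lazily so the helper can be exercised without LiteLLM installed.
        import litellm

        completion_fn = litellm.completion

    errors = []
    for model in models:
        try:
            return completion_fn(model=model, messages=messages)
        except Exception as exc:  # broad on purpose: any failure moves to the next model
            errors.append(f"{model}: {exc}")

    # Every candidate failed, so surface all collected errors at once.
    raise RuntimeError("All models failed: " + "; ".join(errors))
```

In a real application you would call it with an ordered list such as `["gpt-4o", "anthropic/claude-sonnet-4-20250514"]` and leave `completion_fn` unset.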

Installation and Setup

Installing LiteLLM

Install LiteLLM using pip:

pip install litellm

For the proxy server feature, install with additional dependencies:

pip install 'litellm[proxy]'

Configuring API Keys

Before getting started, prepare your API keys from the providers you want to use. Store them as environment variables:

# OpenAI
export OPENAI_API_KEY="sk-your-openai-key"

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"

# Google Gemini
export GEMINI_API_KEY="your-gemini-key"

You can also use a .env file and load it with python-dotenv:

from dotenv import load_dotenv

load_dotenv()  # reads KEY=value pairs from .env into the process environment
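Before making any calls, it can help to check that the keys are actually set. The sketch below assumes the variable names `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `GEMINI_API_KEY`; the helper `missing_keys` is a hypothetical name and is not part of LiteLLM or python-dotenv.

```python
import os
from typing import Mapping, Optional, Sequence

# Assumed set of providers for this tutorial; trim it to the ones you use.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def missing_keys(
    env: Optional[Mapping[str, str]] = None,
    required: Sequence[str] = REQUIRED_KEYS,
) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_keys()` with no arguments inspects the real environment, so run it after loading your .env file.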

Basic Completion Calls

Calling OpenAI GPT

import litellm

# Call OpenAI GPT-4o
response = litellm.completion(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what machine learning is in 2 sentences."},
    ],
)

print(response.choices[0].message.content)
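Because the response follows the OpenAI format, the generated text sits under `choices[0].message.content` and token counts under `usage`. The sketch below pulls both into a plain dictionary. `summarize_response` is a hypothetical helper, and the stub object only mimics the response shape so the example runs without an API key.

```python
from types import SimpleNamespace


def summarize_response(response) -> dict:
    """Extract the reply text and total token count from an OpenAI-style response."""
    usage = getattr(response, "usage", None)
    return {
        "text": response.choices[0].message.content,
        # total_tokens may be absent on some responses, so fall back to None.
        "total_tokens": getattr(usage, "total_tokens", None),
    }


# Stand-in for a real response object, built only for demonstration.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="hi"))],
    usage=SimpleNamespace(total_tokens=12),
)
summary = summarize_response(fake_response)
```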

Calling Anthropic Claude

import litellm

# Call Anthropic Claude - just change the model name!
response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[
        {"role": "user", "content": "Explain what a neural network is in 2 sentences."},
    ],
)

print(response.choices[0].message.content)

Calling Google Gemini

import litellm

# Call Google Gemini
response = litellm.completion(
    model="gemini/gemini-2.0-flash",
    messages=[
        {"role": "user", "content": "What is the difference between supervised and unsupervised learning?"},
    ],
)

print(response.choices[0].message.content)

Calling Ollama (Local Models)

import litellm

# Call a local model via Ollama
response = litellm.completion(
    model="ollama/llama3",
    messages=[
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."},
    ],
    api_base="http://localhost:11434",  # address of the local Ollama server
)

print(response.choices[0].message.content)
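The examples above share a naming convention: a provider prefix, a slash, then the model name, with the prefix omitted for OpenAI models such as "gpt-4o". The sketch below splits a model string along those lines. `split_model_name` is a hypothetical helper written for illustration and says nothing about how LiteLLM resolves providers internally.

```python
from typing import Optional, Tuple


def split_model_name(model: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' into its parts; provider is None when no prefix is given."""
    provider, sep, name = model.partition("/")
    if not sep:
        # No slash at all, e.g. "gpt-4o".
        return (None, model)
    # Only the first slash separates the provider; later slashes stay in the name.
    return (provider, name)
```

For example, `split_model_name("ollama/llama3")` yields the provider `"ollama"` and the model `"llama3"`.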

Unified Interface - Same Code, Different Providers

The key advantage of LiteLLM is that you can create a generic function that works with any provider:

import litellm

def ask_ai(question: str, model: str = "gpt-4o") -> str:
    """Universal function to query any AI model."""
    response = litellm.completion(
        model=model,
        messages=[
            {"role": "user", "content": question},
        ],
        temperature=0.7,
        max_tokens=1000,
    )
    # Return only the generated text, matching the declared str return type
    return response.choices[0].message.content
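One practical use of such a generic function is sending the same question to several models and comparing the answers. The sketch below assumes any callable that takes `(question, model)` and returns a string, like the function above. The name `ask_many` and its error-string convention are illustrative choices, not LiteLLM features.

```python
from typing import Callable, Dict, Sequence


def ask_many(
    question: str,
    models: Sequence[str],
    ask_fn: Callable[[str, str], str],
) -> Dict[str, str]:
    """Send one question to each model and collect the answers by model name."""
    results = {}
    for model in models:
        try:
            results[model] = ask_fn(question, model)
        except Exception as exc:
            # Record the failure instead of aborting the whole comparison.
            results[model] = f"ERROR: {exc}"
    return results
```

Passing the function by argument keeps the comparison loop independent of any one provider and makes it easy to test with a stub.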
