LiteLLM Gateway

LiteLLM is a proxy that presents a unified OpenAI-compatible API in front of 100+ LLM providers. Deploy it to abstract away provider-specific APIs and add centralized logging and rate limiting.

What this page covers

LiteLLM architecture and why to use it
Docker Compose deployment
Configuring providers (Anthropic, OpenAI, local llama.cpp)
Model routing and fallback configuration
API key management for downstream clients

Why use LiteLLM

With LiteLLM, your agents always talk to the same OpenAI-compatible endpoint (http://litellm:4000/v1). You can:

Add a new provider without touching agent code
Route claude-3-5-sonnet calls to Anthropic and llama3 calls to a local llama.cpp instance
See all LLM spending in one dashboard

Docker Compose deployment

services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000"
    volumes:
      - ./litellm-config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml", "--port", "4000"]

Provider configuration

# litellm-config.yaml
model_list:
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      api_key: os.environ/ANTHROPIC_API_KEY

  - model_name: llama-local
    litellm_params:
      model: openai/llama3
      api_base: http://llamacpp:8080/v1
      api_key: none

Deploy LiteLLM Gateway

Table of Contents

LiteLLM Gateway

What this page covers

Why use LiteLLM

Docker Compose deployment

Provider configuration

Related reference docs