Welcome to Edgee

Edgee is an Agent Gateway, the infrastructure layer between your coding agent (Claude Code, Codex, OpenCode, Cursor) and the LLM provider APIs (Anthropic, OpenAI, GLM, others). It applies three things to every request: compression of context, intelligent routing across providers, and observability of token consumption.

The three pillars

Compress

Surgical removal of redundancy from what enters and leaves the model. Two layers: Input (~99% of token volume) and Output (~1%, high ROI).
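As a toy illustration of what removing redundancy from input context can mean, the sketch below drops blocks that repeat verbatim (after whitespace normalization) and reports a rough count of what was saved. This is not Edgee's compression algorithm, which this page does not describe; `dedupe_blocks` is a hypothetical helper, and counting whitespace-separated words is only a crude stand-in for real tokenization.

```python
from typing import List, Tuple


def dedupe_blocks(blocks: List[str]) -> Tuple[List[str], int]:
    """Drop blocks already seen earlier in the context.

    Returns (kept_blocks, approx_saved_tokens). Hypothetical helper,
    not part of any Edgee SDK.
    """
    seen = set()
    kept: List[str] = []
    saved = 0
    for block in blocks:
        # Normalize whitespace so trivially reformatted repeats still match.
        key = " ".join(block.split())
        if key in seen:
            # Crude proxy for tokens: whitespace-separated words.
            saved += len(key.split())
            continue
        seen.add(key)
        kept.append(block)
    return kept, saved
```

The first occurrence of each block is always kept, so the model still sees every distinct piece of context once.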

Route

Per-request fallback on provider 5xx and timeouts. Plan-cap continuity for Claude Pro/Max users when quota is hit. Configurable provider chain.
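The fallback behavior described above can be sketched as a loop over an ordered provider chain that moves to the next provider only on timeouts and 5xx responses. This is a minimal illustration under stated assumptions, not Edgee's implementation: `Provider`, `ProviderError`, and `send_with_fallback` are hypothetical names, and each provider is modeled as a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status returned by a provider."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(message or f"provider returned {status}")
        self.status = status


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # takes a prompt, returns the completion text


def is_retryable(exc: Exception) -> bool:
    # Fall back only on timeouts and 5xx; other errors go back to the caller.
    if isinstance(exc, TimeoutError):
        return True
    return isinstance(exc, ProviderError) and 500 <= exc.status < 600


def send_with_fallback(chain: Sequence[Provider], prompt: str) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, text)."""
    last_error: Optional[Exception] = None
    for provider in chain:
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:
            if not is_retryable(exc):
                raise
            last_error = exc  # remember the failure and try the next provider
    raise RuntimeError("all providers in the chain failed") from last_error
```

Re-raising non-retryable errors immediately keeps a malformed request (a 4xx) from being replayed against every provider in the chain.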

Observe

Every request, every compression event, and every cost delta is recorded: at session level locally and at team level in the managed console.
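To make session-level metering concrete, here is a minimal sketch that accumulates per-request token counts and compression savings into one summary. The `saved_tokens` field mirrors `response.compression.saved_tokens` from the SDK examples on this page; `RequestRecord`, `SessionMeter`, and the other field names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RequestRecord:
    model: str
    input_tokens: int
    output_tokens: int
    saved_tokens: int = 0  # mirrors response.compression.saved_tokens


@dataclass
class SessionMeter:
    """Hypothetical in-memory meter for a single agent session."""

    records: List[RequestRecord] = field(default_factory=list)

    def record(self, rec: RequestRecord) -> None:
        self.records.append(rec)

    def summary(self) -> Dict[str, float]:
        sent = sum(r.input_tokens + r.output_tokens for r in self.records)
        saved = sum(r.saved_tokens for r in self.records)
        total = sent + saved  # tokens that would have been used uncompressed
        return {
            "requests": len(self.records),
            "tokens_sent": sent,
            "tokens_saved": saved,
            "saved_ratio": saved / total if total else 0.0,
        }
```

A team-level view would aggregate the same records across sessions, keyed by user or project.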

Coding agents, start in seconds

Install the CLI

macOS / Linux

curl -fsSL https://install.edgee.ai | bash

Homebrew (macOS)

brew install edgee-ai/tap/edgee

Windows (PowerShell)

irm https://install.edgee.ai/install.ps1 | iex

Launch your coding agent

Claude Code

edgee launch claude

Codex

edgee launch codex

OpenCode

edgee launch opencode

Your coding agent now runs through Edgee with compression, routing, and metering active. The CLI prints a session-analytics link on exit.

Use the API directly

TypeScript

import Edgee from 'edgee';

const edgee = new Edgee("your-api-key");

const response = await edgee.send({
  model: 'gpt-5.2',
  input: 'What is the capital of France?',
});

console.log(response.text);
if (response.compression) {
  console.log(`Tokens saved: ${response.compression.saved_tokens}`);
}

Python

from edgee import Edgee

edgee = Edgee("your-api-key")

response = edgee.send(
    model="gpt-5.2",
    input="What is the capital of France?"
)

print(response.text)
if response.compression:
    print(f"Tokens saved: {response.compression.saved_tokens}")

Go

package main

import (
    "fmt"
    "log"

    "github.com/edgee-ai/go-sdk/edgee"
)

func main() {
    // Check the constructor error instead of discarding it.
    client, err := edgee.NewClient("your-api-key")
    if err != nil {
        log.Fatal(err)
    }

    response, err := client.Send("gpt-5.2", "What is the capital of France?")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(response.Text())
    if response.Compression != nil {
        fmt.Printf("Tokens saved: %d\n", response.Compression.SavedTokens)
    }
}

Rust

use edgee::Edgee;

// Assumes a Tokio runtime; `.await` needs an async context.
#[tokio::main]
async fn main() {
    let client = Edgee::with_api_key("your-api-key");
    let response = client
        .send("gpt-5.2", "What is the capital of France?")
        .await
        .unwrap();

    println!("{}", response.text().unwrap_or(""));
    if let Some(compression) = &response.compression {
        println!("Tokens saved: {}", compression.saved_tokens);
    }
}

OpenAI SDK

import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://api.edgee.ai/v1",
  apiKey: process.env.EDGEE_API_KEY,
});

const completion = await openai.chat.completions.create({
  model: "gpt-5.2",
  messages: [
    { role: "user", content: "What is the capital of France?" }
  ],
});

console.log(completion.choices[0].message.content);

Anthropic SDK

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.edgee.ai',
  apiKey: process.env.EDGEE_API_KEY,
});

const message = await client.messages.create({
  model: 'claude-sonnet-4.5',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ]
});

console.log(message.content);

LangChain

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
import os

llm = ChatOpenAI(
    base_url="https://api.edgee.ai/v1",
    api_key=os.getenv("EDGEE_API_KEY"),
    model="gpt-5.2",
)

response = llm.invoke([HumanMessage(content="What is the capital of France?")])
print(response.content)

cURL

curl https://api.edgee.ai/v1/chat/completions \
  -H "Authorization: Bearer $EDGEE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.2","messages":[{"role":"user","content":"What is the capital of France?"}]}'

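Edgee works with any OpenAI- or Anthropic-compatible client. Set the base URL to https://api.edgee.ai/v1 for OpenAI-compatible clients, or to https://api.edgee.ai for the Anthropic SDK, as in the examples above.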
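For environments without an SDK, the same OpenAI-compatible call can be sketched with only the Python standard library. The URL, headers, and body mirror the curl example on this page; `build_chat_request` is a hypothetical helper name, and the response shape is assumed to follow the OpenAI chat-completions format used in the OpenAI SDK example.

```python
import json
import urllib.request

# Base URL taken from the OpenAI-compatible examples on this page.
EDGEE_OPENAI_BASE = "https://api.edgee.ai/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request routed through Edgee."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{EDGEE_OPENAI_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen(req, timeout=30)`, parse the body with `json.load`, and read `["choices"][0]["message"]["content"]`, with the API key taken from the `EDGEE_API_KEY` environment variable as in the examples above.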

Next steps

Why Edgee

Book a call
