Edgee is an AI Gateway that reduces LLM costs by up to 50% through intelligent token compression. If you want to save tokens for your coding agents, or if you want to optimize the contexts of your AI applications, Edgee is the solution for you.

Coding Agents: Get Started in Seconds with our CLI

1. Install the CLI

curl -fsSL https://edgee.ai/install.sh | bash

2. Launch your coding assistant

edgee launch claude
# That's it. Claude Code is now running with Edgee compression and full observability enabled.
That’s it. Your coding assistant is now running with Edgee compression and full observability enabled.

AI Applications: Get Started in Seconds with our SDKs

import Edgee from 'edgee';

const edgee = new Edgee("your-api-key");

const response = await edgee.send({
  model: 'gpt-5.2',
  input: 'What is the capital of France?',
});

console.log(response.text);
if (response.compression) {
  console.log(`Tokens saved: ${response.compression.saved_tokens}`);
}
That’s it. You now have access to every major LLM provider, automatic failovers, cost tracking, and full observability, all through Edgee’s Gateway.

3B+ Requests/Month

Up to 50% Input Token Reduction

100+ Global PoPs

Why Choose Edgee?

Building with LLMs is powerful, but comes with challenges:
  • Exploding AI costs: Token usage adds up fast, whether you’re running RAG pipelines, coding with Claude Code, or building multi-turn agents
  • Cost opacity: Bills spike with no visibility into what’s driving costs
  • Vendor lock-in: Your code is tightly coupled to a single provider’s API
  • No fallbacks: When OpenAI goes down, your app goes down
  • Security concerns: Sensitive data flows directly to third-party providers
  • Fragmented observability: Logs scattered across multiple dashboards
Edgee solves all of this with a single integration.

Core Capabilities


Token Compression for Coding Agents

Lossless compression for Claude Code, Codex, and OpenCode. Extend your session duration or cut API costs, with no code changes required.

Token Compression for Agentic Workloads

AI-powered context optimization that reduces token usage. Perfect for long-context prompts and agentic workloads where context windows matter.

Cost & Observability

Real-time cost tracking, latency metrics, and request logs. Know exactly what your AI is doing and costing.
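
To make the cost side concrete, here is a minimal sketch that turns a saved-token count, such as the `saved_tokens` value read in the SDK example, into a rough dollar estimate. The helper names and the price per million tokens are hypothetical placeholders, not Edgee APIs or real provider rates.

```javascript
// Hypothetical helper: estimate the input cost avoided from a saved-token count.
// pricePerMillionUsd is a placeholder; substitute your provider's actual input price.
function estimateSavingsUsd(savedTokens, pricePerMillionUsd) {
  return (savedTokens / 1_000_000) * pricePerMillionUsd;
}

// Hypothetical helper: fraction of the original input tokens removed by compression.
function reductionRatio(originalTokens, savedTokens) {
  if (originalTokens <= 0) return 0;
  return savedTokens / originalTokens;
}

// Made-up numbers: 500,000 of 1,000,000 input tokens saved,
// at a placeholder price of $2 per million input tokens.
console.log(estimateSavingsUsd(500_000, 2));     // 1
console.log(reductionRatio(1_000_000, 500_000)); // 0.5
```

Actual savings depend on the real per-token price of the model in use and on how compressible the context is.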

Unified API

One SDK, access to 200+ models from OpenAI, Anthropic, Google, Mistral, and more. Switch providers with a single line change.
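
As a rough illustration of what switching providers by changing a model string can look like, the sketch below tries a list of models in order against any client that exposes the same `send({ model, input })` call shape as the SDK example. The `sendWithFallback` helper, the stub client, and the model names are all hypothetical. This is client-side logic for illustration only and does not describe how the gateway's automatic failover works.

```javascript
// Hypothetical helper: try each model in order until one succeeds.
// `client` is any object with an async send({ model, input }) method.
async function sendWithFallback(client, models, input) {
  const errors = [];
  for (const model of models) {
    try {
      const response = await client.send({ model, input });
      return { model, response };
    } catch (err) {
      // Record the failure and move on to the next model in the list.
      errors.push(`${model}: ${err.message}`);
    }
  }
  throw new Error(`All models failed: ${errors.join('; ')}`);
}

// Stub client so the sketch runs without an API key or network access.
// It simulates an outage for one made-up model name.
const stubClient = {
  async send({ model, input }) {
    if (model === 'provider-a/model-x') {
      throw new Error('provider unavailable');
    }
    return { text: `[${model}] echo: ${input}` };
  },
};

// Changing providers is a matter of editing the model list.
sendWithFallback(stubClient, ['provider-a/model-x', 'provider-b/model-y'], 'ping')
  .then(({ model, response }) => console.log(model, response.text));
```

With a real client in place of the stub, the order of the `models` array expresses the preferred provider first and the fallbacks after it.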