The Anthropic SDK provides official Python and TypeScript clients for interacting with Claude models. Edgee’s OpenAI-compatible API works seamlessly with the Anthropic SDK, so you keep the SDK’s features while gaining Edgee’s benefits: up to 50% cost reduction through token compression, a unified gateway, automatic failover, and full observability.

Installation

pip install anthropic

Basic Usage

import os
from anthropic import Anthropic

# Initialize client with Edgee endpoint
client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
)

# Send a message
message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(message.content[0].text)  # message.content is a list of content blocks

# Access token usage
print(f"Input tokens: {message.usage.input_tokens}")
print(f"Output tokens: {message.usage.output_tokens}")

Streaming Responses

Stream responses for real-time token delivery:
import os
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
)

# Stream messages
with client.messages.stream(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a short poem about coding"}
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    # After the text stream is exhausted, stream.get_final_message()
    # returns the full Message, including token usage.

Token Usage Tracking

Access standard Anthropic token usage metrics in every response:
import os
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
)

message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Analyze this long document..."}]
)

print(message.content[0].text)  # message.content is a list of content blocks
print(f"Input tokens: {message.usage.input_tokens}")
print(f"Output tokens: {message.usage.output_tokens}")
When compression is enabled, input_tokens reflects the compressed token count. View detailed compression metrics in the Edgee dashboard.
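For quick local visibility alongside the dashboard, the usage numbers on each response can be logged with a rough cost estimate. The helper below is a sketch: the per-million-token prices you pass in are your own assumptions, not real Edgee or Anthropic pricing.

```python
def log_usage(input_tokens: int, output_tokens: int,
              input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Print token counts and return the estimated cost in USD.

    Prices are given per million tokens; the values used here are
    placeholders you supply yourself.
    """
    cost = (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000
    print(f"input={input_tokens} output={output_tokens} est_cost=${cost:.6f}")
    return cost

# e.g. with the usage from the response above (hypothetical prices):
# log_usage(message.usage.input_tokens, message.usage.output_tokens, 3.0, 15.0)
```

When compression is active, feeding `message.usage.input_tokens` into a helper like this shows the post-compression cost directly.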

Compression & Tags via Headers

When using the Anthropic SDK with Edgee, you can control token compression and add tags using HTTP headers:

Enabling Compression

import os
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
    default_headers={
        "x-edgee-enable-compression": "true",
        "x-edgee-compression-rate": "0.8",  # Target 80% compression (0.0-1.0)
    }
)

# All requests will use compression with 80% target rate
message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Analyze this document..."}]
)

Adding Tags for Analytics

Combine compression with tags to track requests in your dashboard:
import os
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
    default_headers={
        "x-edgee-enable-compression": "true",
        "x-edgee-compression-rate": "0.8",
        "x-edgee-tags": "production,anthropic-sdk,user-123"
    }
)
Available Headers:

| Header | Type | Description |
| --- | --- | --- |
| x-edgee-enable-compression | "true" or "false" | Enable token compression for requests (overrides console settings) |
| x-edgee-compression-rate | string | Target compression rate (0.0-1.0, default 0.75) |
| x-edgee-tags | string | Comma-separated tags for analytics and filtering |
You can also enable compression organization-wide or per API key in the Edgee console. Headers override console settings for specific requests.

Multi-Provider Access

With Edgee, you can access models from multiple providers through the same Anthropic SDK client and compare costs across them:
import os
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
)

# Use Claude
claude_response = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

# Use GPT-4o through the same client
gpt_response = client.messages.create(
    model="gpt-4o",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

# Use Mistral
mistral_response = client.messages.create(
    model="mistral-large",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

Function Calling (Tools)

Use Claude’s tool calling with Edgee:
import os
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
)

# Define a tool
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
]

# Send message with tools
message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather like in Paris?"}
    ]
)

print(message.content)  # may include both text and tool_use blocks
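When Claude decides to call the tool, the response ends with stop_reason "tool_use"; you then run the function yourself and send the result back in a tool_result block. The dispatcher below sketches that round trip. The block shapes follow the standard Anthropic Messages API; the weather handler shown is a stub, not a real implementation.

```python
def build_tool_results(content_blocks, handlers) -> list:
    """Run each tool_use block through its handler and return tool_result
    blocks ready to send back as the next user message."""
    results = []
    for block in content_blocks:
        if getattr(block, "type", None) == "tool_use":
            output = handlers[block.name](**block.input)
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(output),
            })
    return results

# Example with a stubbed handler (a real app would call a weather API):
# if message.stop_reason == "tool_use":
#     follow_up = {
#         "role": "user",
#         "content": build_tool_results(
#             message.content,
#             {"get_weather": lambda location: f"Sunny in {location}"},
#         ),
#     }
#     # ...append the assistant message and follow_up, then call
#     # client.messages.create again to get the final answer
```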

Error Handling and Retries

The Anthropic SDK includes built-in retry logic, which works seamlessly with Edgee’s automatic failover:
import os
from anthropic import Anthropic, APIError

client = Anthropic(
    base_url="https://api.edgee.ai",
    api_key=os.environ.get("EDGEE_API_KEY"),
    max_retries=3,  # SDK will retry up to 3 times
)

try:
    message = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(message.content[0].text)  # message.content is a list of content blocks
except APIError as e:
    print(f"API Error: {e}")

Authentication

Edgee uses standard Bearer token authentication. Set your API key as an environment variable:
export EDGEE_API_KEY="sk-edgee-..."
Or in your .env file:
EDGEE_API_KEY=sk-edgee-...
The SDK automatically formats the authentication header as:
Authorization: Bearer {api_key}
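A small guard at startup turns a missing key into a clear error instead of a 401 from the gateway. EDGEE_API_KEY matches the variable above; the helper name itself is just a sketch.

```python
import os

def require_api_key(var: str = "EDGEE_API_KEY") -> str:
    """Return the API key from the environment, failing fast with a clear message."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(
            f"{var} is not set. Export it (export {var}=sk-edgee-...) "
            "or add it to your .env file."
        )
    return key

# client = Anthropic(base_url="https://api.edgee.ai", api_key=require_api_key())
```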

Benefits of Using Anthropic SDK with Edgee

Up to 50% Cost Reduction

Automatic token compression on every request reduces input tokens by up to 50% while preserving output quality.

Multi-Provider Cost Comparison

Compare costs across Claude, GPT-4, Mistral, and 200+ models. Track compression savings per provider.

Automatic Failover

If Claude is rate-limited or unavailable, Edgee automatically routes to backup models without code changes.

Full Observability

Monitor latency, token usage, compression ratios, error rates, and costs for all requests in one dashboard.

Complete Example

Here’s a complete application example:
#!/usr/bin/env python3
import os
from anthropic import Anthropic

def main():
    # Initialize client
    client = Anthropic(
        base_url="https://api.edgee.ai",
        api_key=os.environ.get("EDGEE_API_KEY"),
        default_headers={
            "x-edgee-tags": "production,chat-app"
        }
    )

    # Chat loop
    conversation = []
    print("Chat with Claude (type 'quit' to exit)")

    while True:
        user_input = input("\nYou: ")
        if user_input.lower() == 'quit':
            break

        conversation.append({
            "role": "user",
            "content": user_input
        })

        # Stream response
        print("\nClaude: ", end="", flush=True)
        with client.messages.stream(
            model="claude-sonnet-4.5",
            max_tokens=1024,
            messages=conversation
        ) as stream:
            assistant_message = ""
            for text in stream.text_stream:
                print(text, end="", flush=True)
                assistant_message += text

        conversation.append({
            "role": "assistant",
            "content": assistant_message
        })

if __name__ == "__main__":
    main()

Next Steps