POST /v1/chat/completions
Create chat completion
curl --request POST \
  --url https://api.edgee.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "openai/gpt-5.2",
  "messages": [
    {
      "role": "system",
      "content": "<string>",
      "name": "<string>",
      "tool_call_id": "<string>",
      "refusal": "<string>",
      "tool_calls": [
        {
          "id": "<string>",
          "type": "function",
          "function": {
            "name": "<string>",
            "arguments": "<string>"
          }
        }
      ],
      "cache_control": {
        "type": "ephemeral"
      }
    }
  ],
  "max_tokens": 2,
  "stream": false,
  "stream_options": {
    "include_usage": true
  },
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      }
    }
  ],
  "tool_choice": "none",
  "edgee_tool_ids": [
    "edgee_current_time",
    "edgee_generate_uuid"
  ],
  "edgee_pending_id": "<string>",
  "tags": [
    "<string>"
  ],
  "enable_debug": true,
  "compression_model": "claude"
}
'

Example response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-5.2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 10,
    "total_tokens": 20,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "compression": {
    "saved_tokens": 450,
    "cost_savings": 27000,
    "reduction": 48.99884991374353,
    "time_ms": 12
  }
}

Creates a chat completion for the supplied conversation. The Edgee API is OpenAI-compatible and works with any model and provider. Both streaming and non-streaming responses are supported.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your API key.
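
As a rough illustration of the Bearer scheme described above, the following Python sketch builds (but does not send) an authenticated request using only the standard library. The URL and model come from the curl example on this page; the helper name `build_request` and the `YOUR_API_KEY` placeholder are assumptions made for the example.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.edgee.ai/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with Bearer authentication (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "YOUR_API_KEY",  # placeholder, replace with a real API key
    {
        "model": "openai/gpt-5.2",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
# Nothing is sent until you call urllib.request.urlopen(req).
```

Keeping request construction separate from sending makes it easy to inspect headers and the JSON body before any network call is made.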

Headers

X-Edgee-Tags
string

Comma-separated list of tags for categorizing and filtering requests in analytics and logs. Example: production,chatbot,customer-support

X-Edgee-Debug
boolean

Enable debug mode to include additional debugging information in the response.

X-Edgee-Compression-Model
enum<string>

Compression bundle to apply. Tunes the compressor for the agentic style of the calling client.

Available options: claude, opencode, cursor, codex
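
The three optional headers above could be assembled with a small helper like the one below. This is a sketch, not an official client: the function name `edgee_headers` is hypothetical, and serializing the boolean as the string "true" or "false" is an assumption, since this page does not specify the wire format of X-Edgee-Debug.

```python
# Allowed values copied from the "Available options" list above.
COMPRESSION_MODELS = {"claude", "opencode", "cursor", "codex"}


def edgee_headers(tags=None, debug=None, compression_model=None) -> dict:
    """Build the optional X-Edgee-* headers (hypothetical helper)."""
    headers = {}
    if tags:
        # The page documents a comma-separated list of tags.
        headers["X-Edgee-Tags"] = ",".join(tags)
    if debug is not None:
        # ASSUMPTION: booleans are sent as the strings "true" / "false".
        headers["X-Edgee-Debug"] = "true" if debug else "false"
    if compression_model is not None:
        if compression_model not in COMPRESSION_MODELS:
            raise ValueError(f"unknown compression model: {compression_model!r}")
        headers["X-Edgee-Compression-Model"] = compression_model
    return headers


example = edgee_headers(
    tags=["production", "chatbot", "customer-support"],
    debug=True,
    compression_model="claude",
)
```

Validating the compression model locally fails fast on typos instead of relying on the server to reject the request.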

Body

application/json

model
string
required

ID of the model to use. Format: {author_id}/{model_id} (e.g. openai/gpt-5.2)

Example:

"openai/gpt-5.2"

messages
object[]
required

A list of messages comprising the conversation so far.

Minimum array length: 1

max_tokens
integer

The maximum number of tokens that can be generated in the chat completion.

Required range: x >= 1

stream
boolean
default:false

If true, partial message deltas are sent as they are generated, as in the OpenAI API. Streamed chunks are delivered as Server-Sent Events (SSE).
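
To make the streaming behavior more concrete, here is a minimal parsing sketch. It assumes the OpenAI-style SSE framing (each event is a `data: <json>` line, the stream ends with `data: [DONE]`, and text arrives in `choices[].delta.content`); this page only states that the API is OpenAI-compatible, so treat those details as assumptions. The sample lines are invented for illustration and are not captured from a real response.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines (assumes OpenAI-style framing)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # ASSUMPTION: OpenAI-style end-of-stream marker
            break
        yield json.loads(data)


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[dict]]:
    """Concatenate delta text and keep the usage object if one is present."""
    parts, usage = [], None
    for chunk in chunks:
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


# Invented sample stream, including a final usage-only chunk.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    'data: {"choices": [], "usage": {"prompt_tokens": 10, "completion_tokens": 2, "total_tokens": 12}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(iter_sse_chunks(sample))
```

In a real client the `lines` iterable would come from the HTTP response body, decoded line by line.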

stream_options
object

Options for the streaming response, such as include_usage.

tools
object[]

A list of tools the model may call. Currently, only the function type is supported.

tool_choice

Controls which tool (if any) the model is allowed to call. Accepts a bare string (none / auto), a typed-mode object ({ "type": "auto" | "none" }), or a specific function reference.

Available options: none, auto
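
The accepted shapes of tool_choice can be sketched as follows. The bare-string and typed-mode forms are taken from the description above. The shape of the "specific function reference" is not spelled out on this page, so the OpenAI-style object used in `tool_choice_function` is an assumption, and both helper names and the `get_weather` tool are hypothetical.

```python
def tool_choice_mode(mode: str, as_object: bool = False):
    """Return tool_choice as a bare string or a typed-mode object."""
    if mode not in ("none", "auto"):
        raise ValueError(f"unsupported mode: {mode!r}")
    return {"type": mode} if as_object else mode


def tool_choice_function(name: str) -> dict:
    """Reference one specific function.

    ASSUMPTION: follows the OpenAI-style shape; not confirmed by this page.
    """
    return {"type": "function", "function": {"name": name}}


payload = {
    "model": "openai/gpt-5.2",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": tool_choice_mode("auto"),
}
```

Swapping the last line for `tool_choice_mode("none", as_object=True)` or `tool_choice_function("get_weather")` exercises the other two forms.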
edgee_tool_ids
string[]

List of Edge Tool IDs to inject (e.g. edgee_current_time, edgee_generate_uuid). Each ID must be activated for your API key. When omitted or empty, only tools with hydration enabled for your org or API key are auto-injected. Invalid or non-activated IDs return 400 with invalid_edgee_tool_ids.

Example:

["edgee_current_time", "edgee_generate_uuid"]

edgee_pending_id
string

Pending operation ID when continuing a conversation after Edge Tool execution (e.g. when mixing client-side and Edge Tools). The gateway injects stored Edge Tool results into the message history.

tags
string[]

Optional tags to categorize and label the request. Useful for filtering and grouping requests in analytics and logs. Can also be sent via the X-Edgee-Tags header as a comma-separated string.

enable_debug
boolean

When true, the response includes additional debug information. Equivalent to the X-Edgee-Debug header.

compression_model
enum<string>

Selects the compression bundle to apply to the request. Equivalent to the X-Edgee-Compression-Model header.

Available options: claude, opencode, codex, cursor

Response

Chat completion created successfully

id
string
required

A unique identifier for the chat completion.

Example:

"chatcmpl-123"

object
enum<string>
required

The object type, which is always chat.completion.

Available options: chat.completion

created
integer
required

The Unix timestamp (in seconds) of when the chat completion was created.

Example:

1677652288

model
string
required

The model used for the chat completion.

Example:

"openai/gpt-5.2"

choices
object[]
required

A list of chat completion choices. Can be more than one if n is greater than 1.

usage
object
required

Usage statistics for the completion. In streaming responses, this is only present in the final chunk when stream_options.include_usage is true.

compression
object

Token compression metrics. Present in the response when token compression was applied to the request. The usage.prompt_tokens field reflects the compressed token count actually billed by the provider.
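
One way to read the compression object alongside usage is sketched below. Because usage.prompt_tokens is the compressed count, adding saved_tokens gives an estimate of the uncompressed prompt size; that addition assumes saved_tokens counts prompt tokens removed, which this page implies but does not state outright. The helper name and the sample numbers are hypothetical, and cost_savings and reduction are passed through untouched because their units are not documented here.

```python
from typing import Optional


def summarize_compression(response: dict) -> Optional[dict]:
    """Summarize compression metrics, or return None if none were reported."""
    compression = response.get("compression")
    if not compression:
        return None  # compression was not applied to this request
    billed = response["usage"]["prompt_tokens"]
    saved = compression["saved_tokens"]
    return {
        "billed_prompt_tokens": billed,
        "saved_tokens": saved,
        # ASSUMPTION: saved_tokens counts prompt tokens removed by compression.
        "estimated_original_prompt_tokens": billed + saved,
        "time_ms": compression.get("time_ms"),
    }


# Hypothetical response fragment, not real API output.
sample_response = {
    "usage": {"prompt_tokens": 550, "completion_tokens": 10, "total_tokens": 560},
    "compression": {"saved_tokens": 450, "time_ms": 12},
}
summary = summarize_compression(sample_response)
```

Returning None when the compression object is absent mirrors the documented behavior that the field only appears when compression was applied.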

edgee_pending_id
string

Present when one or more Edge Tool calls were deferred. Pass this value back as edgee_pending_id in the next request to resume the conversation with the tool results filled in.
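
Resuming after deferred Edge Tool calls might look like the sketch below, which copies edgee_pending_id from a response into the next request body. The helper name `build_followup`, the sample ID, and the choice of which messages to append are assumptions; this page only says that the value must be passed back and that the gateway fills in the stored tool results.

```python
def build_followup(previous_request: dict, response: dict, new_messages: list) -> dict:
    """Build the next request body, carrying over edgee_pending_id if present."""
    followup = dict(previous_request)  # shallow copy; the original is left untouched
    followup["messages"] = list(previous_request["messages"]) + list(new_messages)
    pending_id = response.get("edgee_pending_id")
    if pending_id:
        followup["edgee_pending_id"] = pending_id
    else:
        # Do not resend a stale ID from an earlier turn.
        followup.pop("edgee_pending_id", None)
    return followup


first_request = {
    "model": "openai/gpt-5.2",
    "messages": [{"role": "user", "content": "What time is it?"}],
    "edgee_tool_ids": ["edgee_current_time"],
}
# Hypothetical response fragment; the ID value is invented.
first_response = {"edgee_pending_id": "pending-abc123"}

next_request = build_followup(first_request, first_response, new_messages=[])
```

When mixing client-side tools with Edge Tools, `new_messages` would carry the client-side tool results for the same turn.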

edgee_tools_executed
object[]

List of Edge Tools the gateway executed inline before returning. Empty or absent when no Edge Tools ran.