Authentication
All API requests require a bearer token in the Authorization header. Create a key in your dashboard under API Keys → Create.
Authorization: Bearer YOUR_API_KEY
Security note: Never expose your API key in client-side code or commit it to version control. Use environment variables or a secrets manager.
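For example, a server-side application can load the key from an environment variable at startup. The variable name GABFORGE_API_KEY below is illustrative, not an official convention:

```python
import os

# Read the key from the environment instead of hard-coding it in source.
# GABFORGE_API_KEY is a placeholder name; use whatever your deployment defines.
api_key = os.environ.get("GABFORGE_API_KEY", "")
if not api_key:
    print("Warning: GABFORGE_API_KEY is not set")
```

In production, a secrets manager (e.g. Vault or your cloud provider's equivalent) can populate the environment at deploy time so the key never appears in the repository.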
POST
/v1/chat/completions
Creates a model response for the given conversation. Supports both synchronous and streaming modes.
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID to use. Use gabforge-coder for the primary model. |
| messages | array | Yes | Array of message objects. Each has a role (system, user, assistant) and a content string. |
| stream | boolean | No | If true, tokens are returned as server-sent events as they are generated. Default: false. |
| temperature | number | No | Sampling temperature between 0 and 2. Higher values produce more creative output. Default: 1.0. |
| max_tokens | integer | No | Maximum number of tokens to generate. If omitted, generation is bounded only by the model's context window. |
Response — non-streaming
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1741046400,
  "model": "gabforge-coder",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
Response — streaming (SSE)
Each chunk is a data: line whose payload is a JSON object containing a delta. The stream ends with data: [DONE].
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"},"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}
data: [DONE]
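The chunks above can be consumed with a small parser. This is a minimal stdlib sketch that accumulates content deltas from raw SSE lines; a production client would also handle keep-alive blank lines and reconnects:

```python
import json

def parse_sse_chunks(lines):
    """Yield content deltas from raw SSE lines until the [DONE] sentinel."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Using the example chunks shown above:
raw = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"index":0}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"},"index":0}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}',
    "data: [DONE]",
]
print("".join(parse_sse_chunks(raw)))  # prints: Hello!
```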
Example — cURL
curl https://ai.gabforge.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gabforge-coder",
    "messages": [{"role": "user", "content": "Explain async/await in Python."}]
  }'
Example — Python
from openai import OpenAI

client = OpenAI(
    base_url="https://ai.gabforge.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="gabforge-coder",
    messages=[
        {"role": "user", "content": "Explain async/await in Python."},
    ],
    temperature=0.7,
    max_tokens=512,
)

print(response.choices[0].message.content)
GET
/v1/models
Returns the list of models available on this API. No request body required.
Example response
{
  "object": "list",
  "data": [
    {
      "id": "gabforge-coder",
      "object": "model",
      "created": 1741046400,
      "owned_by": "gabforge"
    },
    {
      "id": "claude-sonnet-4-6",
      "object": "model",
      "created": 1741046400,
      "owned_by": "anthropic"
    }
  ]
}
Example — cURL
curl https://ai.gabforge.ai/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"
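The same call can be made from Python. This stdlib sketch builds the authenticated request; sending it (the commented-out part) requires a valid key:

```python
import json
import urllib.request

BASE_URL = "https://ai.gabforge.ai/v1"

def build_models_request(api_key):
    """Build the authenticated GET request for /v1/models."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = build_models_request("YOUR_API_KEY")
# with urllib.request.urlopen(req) as resp:
#     models = json.load(resp)["data"]
#     print([m["id"] for m in models])
```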
Error codes
All errors follow a standard shape. The error.message field contains a human-readable description.
{
  "error": {
    "message": "Invalid API key.",
    "type": "authentication_error",
    "code": 401
  }
}
| HTTP | Error type | Description |
|---|---|---|
| 401 | authentication_error | Your API key is missing, malformed, or has been revoked. Check your Authorization header. |
| 429 | rate_limit_error | You have exceeded your plan's rate limit. The response includes a Retry-After header with the number of seconds to wait. |
| 503 | service_unavailable | The local inference engine and cloud fallback are both temporarily unavailable. Retry with exponential backoff. Check the status page for incidents. |
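A retry loop for 429 and 503 responses can combine the Retry-After header with exponential backoff. The helper below is a sketch; the base delay, cap, and jitter range are illustrative defaults, not values mandated by the API:

```python
import random

def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based).

    A Retry-After value, when the response provides one, takes priority;
    otherwise use capped exponential backoff with jitter to avoid
    synchronized retries from many clients.
    """
    if retry_after is not None:
        return float(retry_after)
    return min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)
```

A caller would sleep for backoff_delay(attempt, retry_after) after each failed request and give up after a fixed number of attempts.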