Zima Labs API

API Documentation

An OpenAI-compatible API with zero data retention and hardware-level encryption. Drop in your existing SDK: change the base URL and key, nothing else.

Getting Started

Access state-of-the-art AI models through simple REST endpoints. Fast responses, per-token billing, production-ready infrastructure.

22+ Models

  • GPT-5, GPT-4o, o1/o3
  • Claude Sonnet, Opus, Haiku
  • DeepSeek, Qwen, Llama

Reasoning Models

  • o1, o3, o4, GPT-5
  • DeepSeek-R1 with reasoning_content
  • Chain-of-thought returned separately from the clean answer

Enterprise Ready

  • Real-time token billing
  • Sub-second responses
  • Automatic failover

Authentication

All requests require your API key in the Authorization header as a Bearer token. Generate keys from your dashboard.

curl -X GET https://api.zimalabs.io/api/v1/usage \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --max-time 10

Security: Keep your API key confidential. Keys start with zima- and never expire unless rotated.
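
The header format above can be built with a small helper. This is a minimal Python sketch: the function name auth_headers is hypothetical, and the prefix check is a client-side convenience that assumes only what this page states (keys start with zima-). The server decides whether a key is actually valid.

```python
def auth_headers(api_key: str) -> dict:
    """Return the Authorization header for an API key."""
    # Client-side sanity check only; it catches pasted keys from other providers.
    if not api_key.startswith("zima-"):
        raise ValueError("API key should start with 'zima-'")
    return {"Authorization": f"Bearer {api_key}"}
```

Read the key from an environment variable or secret store rather than hard-coding it, in line with the security note above.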

API Endpoints

Base URL: https://api.zimalabs.io/api/v1

POST /api/v1/chat/completions

Generate chat completions with any supported model.

curl -X POST https://api.zimalabs.io/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MODEL_ID",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 50,
    "temperature": 0.7
  }' \
  --max-time 10

GET /api/v1/models

List all available models, providers, and access requirements.

curl -X GET https://api.zimalabs.io/api/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --max-time 10

GET /api/v1/usage

Check your balance, token usage, and billing info.

curl -X GET https://api.zimalabs.io/api/v1/usage \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --max-time 10
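
The three endpoints above share the same base URL and header, so one request builder can cover them. This sketch uses only the Python standard library. The names build_request and send are hypothetical, and nothing here is an official client.

```python
import json
import urllib.request

BASE_URL = "https://api.zimalabs.io/api/v1"  # base URL from this page


def build_request(method, path, api_key, body=None):
    """Build (but do not send) a request for one of the endpoints above."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # Only requests with a JSON body need the Content-Type header.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request, timeout=10):
    """Send the request and decode the JSON response.

    The timeout applies to each blocking socket operation, so it is only
    an approximation of curl's total-time cap (--max-time 10).
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, send(build_request("GET", "/usage", "YOUR_API_KEY")) corresponds to the usage check, and passing a body with "POST" and "/chat/completions" corresponds to the completion request.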

Response Format

Responses follow the OpenAI format, extended with a billing object on every completion.

{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "model": "MODEL_ID",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "..."}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 12, "completion_tokens": 20, "total_tokens": 32},
  "billing": {"cost_usd": 0.0003, "remaining_credits": 10.9997}
}

Billing Fields

  • cost_usd: Total cost for this request
  • remaining_credits: Your balance after the request
  • api_key_name: Name of the key used

Usage Fields

  • prompt_tokens: Input tokens consumed
  • completion_tokens: Output tokens generated
  • total_tokens: Sum of prompt_tokens and completion_tokens
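
The fields above can be pulled out of a decoded response as follows. This is a sketch: CompletionSummary and summarize are hypothetical names. Because api_key_name is listed under Billing Fields but does not appear in the sample response, it is treated as optional here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompletionSummary:
    content: str
    total_tokens: int
    cost_usd: float
    remaining_credits: float
    api_key_name: Optional[str]  # assumed optional; absent from the sample response


def summarize(response: dict) -> CompletionSummary:
    """Collect the answer, token usage, and billing fields from one response."""
    message = response["choices"][0]["message"]
    usage = response["usage"]
    billing = response["billing"]
    return CompletionSummary(
        content=message["content"],
        total_tokens=usage["total_tokens"],
        cost_usd=billing["cost_usd"],
        remaining_credits=billing["remaining_credits"],
        api_key_name=billing.get("api_key_name"),
    )
```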

Error Handling

Errors use standard HTTP status codes and return a structured JSON error object.

{
  "error": {
    "type": "invalid_request_error",
    "message": "Model 'gpt-5' not found or inactive"
  }
}

Status Codes

  • 200: Success
  • 400: Bad Request
  • 401: Unauthorized
  • 402: Insufficient Credits
  • 408: Request Timeout (8s)
  • 500: Server Error
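
One way to turn these status codes and the structured error body into a Python exception is sketched below. ZimaAPIError and raise_for_error are hypothetical names, and the fallback labels are copied from the list above. Treating every status other than 200 as an error is an assumption.

```python
import json

# Labels copied from the status code list above.
STATUS_LABELS = {
    400: "Bad Request",
    401: "Unauthorized",
    402: "Insufficient Credits",
    408: "Request Timeout",
    500: "Server Error",
}


class ZimaAPIError(Exception):
    """Hypothetical exception carrying the structured error fields."""

    def __init__(self, status, error_type, message):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message


def raise_for_error(status: int, body: str) -> None:
    """Raise ZimaAPIError for any status other than 200."""
    if status == 200:
        return
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None  # body was not JSON at all
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}  # fall back to the status label below
    raise ZimaAPIError(
        status,
        error.get("type", "unknown_error"),
        error.get("message", STATUS_LABELS.get(status, "Unexpected status")),
    )
```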

For reasoning models, the chain-of-thought is returned in message.reasoning_content and the clean answer is always in message.content, so existing clients need no code changes.
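
Reading both fields from a decoded response might look like this. The function name split_reasoning is hypothetical. For models that do not return reasoning_content, the second value is None.

```python
def split_reasoning(response: dict) -> tuple:
    """Return (answer, reasoning) from a decoded chat completion."""
    message = response["choices"][0]["message"]
    # content is always present; reasoning_content only for reasoning models.
    return message["content"], message.get("reasoning_content")
```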

Model Types & Parameters

Models fall into two categories. Reasoning models auto-exclude temperature and receive higher token limits.

Reasoning Models

Temperature excluded automatically. Higher token limits applied.

  • GPT-5, o1, o3, o4 series
  • DeepSeek-R1, gpt-oss-120b
  • kimi-k2-thinking

Returns: reasoning_content (chain-of-thought) + content (clean answer)

Standard Models

Full parameter support: temperature, max_tokens, top_p.

  • Claude Sonnet, Haiku, Opus
  • GPT-4o, GPT-4o-mini
  • DeepSeek-V3, Qwen, GLM

Returns: content only (no reasoning_content)
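
The API excludes temperature for reasoning models on its own, so the following is optional. It is a client-side sketch for callers who prefer to build an explicit payload per category. The ID prefixes are assumptions inferred from the model names on this page, and GET /api/v1/models remains the authoritative source for real IDs. The names is_reasoning_model and build_payload are hypothetical.

```python
# Assumed ID prefixes, inferred from the model names listed above.
REASONING_PREFIXES = (
    "gpt-5", "o1", "o3", "o4",
    "deepseek-r1", "gpt-oss-120b", "kimi-k2-thinking",
)


def is_reasoning_model(model_id: str) -> bool:
    """Guess the category from the model ID prefix (an assumption, see above)."""
    return model_id.lower().startswith(REASONING_PREFIXES)


def build_payload(model_id, messages, max_tokens, temperature=None):
    """Build a chat payload, leaving out temperature for reasoning models."""
    payload = {"model": model_id, "messages": messages, "max_tokens": max_tokens}
    if temperature is not None and not is_reasoning_model(model_id):
        payload["temperature"] = temperature
    return payload
```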

4th Wall Switching

An optional feature (off by default) that lets users switch models mid-conversation using natural language: “switch to GPT-5”, “use Claude instead”.

curl -X POST https://api.zimalabs.io/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Switch to Claude and write a poem"}],
    "enable4thWall": true,
    "announceSwitches": true,
    "max_tokens": 200
  }'

enable4thWall

  • Type: boolean
  • Default: false
  • Enables conversational model switching

announceSwitches

  • Type: boolean
  • Default: true
  • Notifies when a model switch occurs
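
Since both flags are ordinary top-level fields in the request body, they can be added to an existing payload without touching the rest of it. A sketch with a hypothetical helper name, with_4th_wall:

```python
def with_4th_wall(payload: dict, announce: bool = True) -> dict:
    """Return a copy of a chat payload with 4th Wall Switching enabled."""
    # A new dict is returned so the caller's original payload stays unchanged.
    return {**payload, "enable4thWall": True, "announceSwitches": announce}
```

Passing announce=False keeps switching enabled but turns off the switch notification.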

Quick Testing

Ready-to-use commands to verify your integration. Replace YOUR_API_KEY with a key from your dashboard.

1. Check auth & balance

curl -X GET https://api.zimalabs.io/api/v1/usage \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --max-time 10

2. Reasoning model (GPT-5), no temperature

curl -X POST https://api.zimalabs.io/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "What is 15 × 23?"}],
    "max_tokens": 200
  }'

3. Standard model (Claude Sonnet), full params

curl -X POST https://api.zimalabs.io/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-3.7",
    "messages": [{"role": "user", "content": "Write a short story about AI."}],
    "max_tokens": 150,
    "temperature": 0.8
  }'

4. DeepSeek-R1: inspect reasoning_content

curl -X POST https://api.zimalabs.io/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Explain quantum computing simply."}],
    "max_tokens": 300
  }'

5. List all models

curl -X GET https://api.zimalabs.io/api/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --max-time 10

Note: Reasoning models automatically exclude temperature and receive higher token limits. Their chain-of-thought is in reasoning_content; content is always the clean answer. Your existing code needs zero changes.
