One API. Every AI Model.

Access OpenAI, Gemini, Claude, and DeepSeek through a single, unified API. Drop-in compatible. One key, all providers.

4 Providers 20+ Models Drop-In Compatible Automatic Failover
$ curl -X POST https://ai.codexpert.io/api/v1/openai \
  -H "API-Key: your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'

Supported Providers & Models

Access the latest models from leading AI providers. We keep the model list up to date so you don't have to.

OpenAI

/v1/openai
  • gpt-4o Flagship
  • gpt-4o-mini Fast
  • gpt-4-turbo Turbo
  • gpt-4 Standard
  • gpt-3.5-turbo Economy
  • o1 Reasoning
  • o1-mini Compact reasoning
  • o3-mini Latest reasoning

Google Gemini

/v1/gemini
  • gemini-2.5-flash Latest
  • gemini-2.5-flash-preview
  • gemini-2.5-pro-preview
  • gemini-2.0-flash-lite Lite
  • gemini-1.5-pro Pro
  • gemini-1.5-flash Fast

Anthropic Claude

/v1/claude
  • claude-opus-4-6 Most capable
  • claude-sonnet-4-6 Balanced
  • claude-sonnet-4-5
  • claude-haiku-4-5 Fast

DeepSeek

/v1/deepseek
  • deepseek-chat General
  • deepseek-reasoner Reasoning

How It Works

Three steps to start using any AI model through Codexpert AI.

1

Get Your API Key

Request an API key from our team. Each key can be scoped to specific providers and rate limits. Public self-service generation is coming soon.

2

Pick a Provider & Model

Choose an endpoint: /v1/openai, /v1/gemini, /v1/claude, /v1/deepseek, or /v1/base for the system default. Specify a model or let the system use the default.

3

Send Requests, Get Responses

Use the exact same request and response format as the original provider's API. No translation layer, no format changes. If you know OpenAI's API, you already know ours.

Developer Documentation

Everything you need to integrate Codexpert AI into your application.

Authentication

All API requests require an API-Key header. Include it in every request:

API-Key: cxai_a1b2c3d4e5f6...

Keys are tied to your account and can be restricted to specific providers. If the key is missing or invalid, you'll get a 401 response.
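As a small sketch, a helper like the following (the function name is ours, not part of any SDK) keeps the two required headers in one place:

```python
def auth_headers(api_key: str) -> dict:
    """Build the two headers every Codexpert AI request needs.

    `api_key` is your cxai_... key; both header names come
    straight from the docs above.
    """
    return {
        "API-Key": api_key,
        "Content-Type": "application/json",
    }
```

Pass the result to whatever HTTP client you use; examples below use `requests` and `fetch`.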

Base URL

All endpoints are relative to:

https://ai.codexpert.io/api/v1

Append the provider name to form the full URL. For example, the OpenAI endpoint is:

https://ai.codexpert.io/api/v1/openai

Request Format

Send a POST request with a JSON body. The body format must match the original provider's API specification. The model parameter selects which model to use. If omitted, the system default model for that provider is used.

{
  "model": "gpt-4o",                          // required (or use system default)
  "messages": [                                // provider-specific body
    {"role": "user", "content": "Hello!"}
  ]
}

Gemini uses a different body format (contents instead of messages). Refer to each provider's docs for their specific schema. Codexpert AI passes your request through as-is.
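The schema difference above can be captured in a small helper. This is a sketch (the function name is ours): it builds a minimal single-turn body in each provider's native format, including Claude's required max_tokens.

```python
def chat_body(provider: str, model: str, text: str) -> dict:
    """Build a minimal single-turn request body in a provider's native schema.

    OpenAI, Claude, and DeepSeek use a `messages` list;
    Gemini uses `contents` / `parts` instead.
    """
    if provider == "gemini":
        return {"model": model, "contents": [{"parts": [{"text": text}]}]}
    body = {"model": model, "messages": [{"role": "user", "content": text}]}
    if provider == "claude":
        body["max_tokens"] = 1024  # Claude requires max_tokens on every request
    return body
```

Anything beyond a single user turn (system prompts, temperature, multi-turn history) follows each provider's own documentation, since the body is passed through as-is.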

Endpoints

Method  Endpoint       Provider         Request Format
POST    /v1/openai     OpenAI           OpenAI Chat Completions
POST    /v1/gemini     Google Gemini    Gemini generateContent
POST    /v1/claude     Anthropic        Claude Messages
POST    /v1/deepseek   DeepSeek         OpenAI-compatible
POST    /v1/base       System default   Matches default provider
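If you dispatch to several providers from one codebase, a small lookup like this (a sketch; the names are ours) keeps the endpoint list in one place and fails fast on typos:

```python
BASE_URL = "https://ai.codexpert.io/api/v1"

# The five endpoints from the table above.
PROVIDERS = {"openai", "gemini", "claude", "deepseek", "base"}

def endpoint_for(provider: str) -> str:
    """Return the full endpoint URL for a provider, or raise on unknown names."""
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {provider}")
    return f"{BASE_URL}/{provider}"
```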

Response Format

Responses are returned exactly as the provider sends them. The HTTP status code matches the provider's response. A successful OpenAI call returns:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits (qubits)..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 120,
    "total_tokens": 135
  }
}

Claude, Gemini, and DeepSeek each return their own native format. Codexpert AI never modifies the response body.

Error Codes

Code  Meaning        Common Cause
200   Success        Request completed normally
400   Bad Request    Missing model, invalid JSON, or provider rejected the request
401   Unauthorized   Missing or invalid API-Key header
403   Forbidden      Your API key is not authorized for the requested provider
429   Rate Limited   Too many requests per minute for your key tier
500   Server Error   Provider is down or internal error (failover may trigger)

Error Response Format

{
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Please try again later.",
  "data": { "status": 429 }
}
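Since all three fields are optional to the caller, a defensive parser is a reasonable sketch (the function name is ours):

```python
def parse_error(payload: dict) -> tuple:
    """Extract (code, message, status) from a Codexpert AI error body.

    Falls back to safe defaults if a field is missing, so callers
    never have to guard against KeyError.
    """
    return (
        payload.get("code", "unknown"),
        payload.get("message", ""),
        payload.get("data", {}).get("status", 0),
    )
```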

Code Examples

Full request and response examples for every provider, in popular languages.

cURL -- OpenAI / DeepSeek
Request
curl -X POST https://ai.codexpert.io/api/v1/openai \
  -H "API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum computing uses quantum bits (qubits) that can exist in multiple states simultaneously..."
    },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 28, "completion_tokens": 145, "total_tokens": 173 }
}

DeepSeek uses the same format. Just change the endpoint to /v1/deepseek and the model to deepseek-chat.

cURL -- Anthropic Claude
Request
curl -X POST https://ai.codexpert.io/api/v1/claude \
  -H "API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Write a Python function to reverse a linked list."}
    ]
  }'
Response
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [{
    "type": "text",
    "text": "Here's a Python function to reverse a linked list:\n\n```python\ndef reverse_linked_list(head):\n    prev = None\n    current = head\n    while current:\n        next_node = current.next\n        current.next = prev\n        prev = current\n        current = next_node\n    return prev\n```"
  }],
  "usage": { "input_tokens": 18, "output_tokens": 95 }
}

Claude requires max_tokens in every request. The response uses input_tokens / output_tokens instead of OpenAI's naming.

cURL -- Google Gemini
Request
curl -X POST https://ai.codexpert.io/api/v1/gemini \
  -H "API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "contents": [{
      "parts": [{"text": "Summarize the history of the internet in 3 paragraphs."}]
    }]
  }'
Response
{
  "candidates": [{
    "content": {
      "parts": [{"text": "The internet originated in the late 1960s..."}],
      "role": "model"
    },
    "finishReason": "STOP"
  }],
  "usageMetadata": {
    "promptTokenCount": 14,
    "candidatesTokenCount": 280,
    "totalTokenCount": 294
  }
}

Gemini uses contents / parts instead of messages. You still specify the model in the request body; Codexpert AI maps it to the correct Gemini URL internally.

JavaScript (fetch)
const response = await fetch('https://ai.codexpert.io/api/v1/openai', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'API-Key': 'your_api_key'
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'What is machine learning?' }
    ]
  })
});

const data = await response.json();

if (!response.ok) {
  console.error('Error:', data.message);
} else {
  console.log(data.choices[0].message.content);
}
Python (requests)
import requests

response = requests.post(
    'https://ai.codexpert.io/api/v1/openai',
    headers={
        'Content-Type': 'application/json',
        'API-Key': 'your_api_key'
    },
    json={
        'model': 'gpt-4o',
        'messages': [
            {'role': 'user', 'content': 'Explain recursion with an analogy.'}
        ]
    }
)

data = response.json()

if response.ok:
    print(data['choices'][0]['message']['content'])
else:
    print(f"Error {response.status_code}: {data.get('message', 'Unknown error')}")
Python (OpenAI SDK -- drop-in)
from openai import OpenAI

client = OpenAI(
    api_key="your_api_key",
    base_url="https://ai.codexpert.io/api/v1/openai",
    default_headers={"API-Key": "your_api_key"}
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a merge sort in Python."}
    ],
    temperature=0.3
)

print(response.choices[0].message.content)

Works with the official OpenAI Python SDK. Just override base_url and add the API-Key header.

PHP (WordPress)
$response = wp_remote_post( 'https://ai.codexpert.io/api/v1/openai', array(
    'headers' => array(
        'Content-Type' => 'application/json',
        'API-Key'      => 'your_api_key',
    ),
    'body'    => wp_json_encode( array(
        'model'    => 'gpt-4o',
        'messages' => array(
            array( 'role' => 'user', 'content' => 'Hello from WordPress!' ),
        ),
    ) ),
    'timeout' => 60,
) );

if ( ! is_wp_error( $response ) ) {
    $body = json_decode( wp_remote_retrieve_body( $response ), true );
    echo $body['choices'][0]['message']['content'];
}

Migration Guide

Already using a provider directly? Switching to Codexpert AI takes two lines of change. Your request body and response handling stay identical.

From OpenAI Direct

Before (Direct OpenAI)
URL:    https://api.openai.com/v1/chat/completions
Header: Authorization: Bearer sk-proj-...
Body:   { "model": "gpt-4o", ... }   // unchanged
After (Codexpert AI)
URL:    https://ai.codexpert.io/api/v1/openai
Header: API-Key: cxai_...
Body:   { "model": "gpt-4o", ... }   // unchanged

From Claude Direct

Before (Direct Anthropic)
URL:    https://api.anthropic.com/v1/messages
Header: x-api-key: sk-ant-...
Header: anthropic-version: 2023-06-01
Body:   { "model": "claude-sonnet-4-6", ... }
After (Codexpert AI)
URL:    https://ai.codexpert.io/api/v1/claude
Header: API-Key: cxai_...
// No anthropic-version header needed
Body:   { "model": "claude-sonnet-4-6", ... }

From Gemini Direct

Before (Direct Gemini)
URL:    https://generativelanguage...
          /v1beta/models/gemini-2.5-flash
          :generateContent?key=AIza...
Body:   { "contents": [...] }
After (Codexpert AI)
URL:    https://ai.codexpert.io/api/v1/gemini
Header: API-Key: cxai_...
Body:   { "model": "gemini-2.5-flash",
          "contents": [...] }

Why Codexpert AI

Built for developers who want simplicity without compromising on features.

Drop-In Replacement

Same request and response format as native APIs. Change the URL and auth header -- your code, SDKs, and libraries work as-is.

🛠

Automatic Failover

If a provider returns a 5xx error, Codexpert AI automatically retries with a pre-configured fallback provider. Zero downtime for your users.

🔑

One API Key

No need to sign up for 4 different platforms and manage separate billing. One key, one bill, all models.

📊

Usage Analytics

Every request is logged with token counts (input/output), response times, status codes, and full request/response bodies for debugging.

🛡

Per-Key Rate Limits

Configure rate limits per API key. Give different clients different tiers. Rate limiting is enforced automatically per minute.

🔒

Provider Restrictions

Scope each API key to specific providers. A key for your mobile app might only access DeepSeek, while your backend gets everything.

Frequently Asked Questions

Quick answers to common questions about Codexpert AI.

Do I need accounts with each AI provider?
No. Codexpert AI handles all provider credentials on the backend. You only need a Codexpert AI API key.
Does the API modify my requests or responses?
No. Your request body is forwarded to the provider exactly as you send it. The provider's response is returned to you unmodified. Codexpert AI is a transparent pass-through.
Can I use existing SDKs (OpenAI Python, etc.)?
Yes. Since the request/response formats are identical, you can use any existing SDK by overriding the base URL and adding the API-Key header. See the Python OpenAI SDK example above.
What happens if I don't specify a model?
The system will use the default model configured for that provider. This is set by the administrator and can be changed at any time. If no default is configured, you'll receive a 400 error asking for a model.
How does automatic failover work?
Each provider can have a fallback provider configured. If the primary provider returns a 5xx server error, Codexpert AI automatically retries the same request with the fallback. This is transparent -- you receive the fallback's response without any changes to your code.
What are the rate limits?
Rate limits are configured per API key (requests per minute). When exceeded, you'll receive a 429 response. Wait briefly and retry. Contact us if you need higher limits.
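The "wait briefly and retry" advice can be sketched as exponential backoff on 429. This is our sketch, not an official client: it takes the `post` callable as a parameter (e.g. `requests.post`), and a production version might also honor a Retry-After header if the gateway sends one (not confirmed above).

```python
import time

def post_with_retry(post, url, headers, body, max_retries=3, sleep=time.sleep):
    """Call `post` (e.g. requests.post), retrying on HTTP 429
    with exponential backoff: 1s, 2s, 4s, ...

    Returns the last response, whether or not it succeeded.
    """
    for attempt in range(max_retries + 1):
        resp = post(url, headers=headers, json=body, timeout=60)
        if resp.status_code != 429 or attempt == max_retries:
            return resp
        sleep(2 ** attempt)  # back off before the next attempt
```

Usage: `post_with_retry(requests.post, endpoint, headers, body)`.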
Is my data logged?
Request metadata (provider, model, token counts, duration, status code) is logged for usage tracking and billing. Request and response bodies are logged for debugging purposes and can be cleared at any time by the administrator.
Can I use this for production applications?
Yes. Codexpert AI is designed for production use with automatic failover, rate limiting, usage logging, and encrypted credential storage. It runs on your own infrastructure, giving you full control.

Ready to Get Started?

Request your API key today. Start building with any AI model through a single, unified API.

Get API Key Read the Docs