API Reference

MCP REST API

The Model Context Protocol (MCP) REST API provides a standard HTTP interface to AI models deployed on the Citrate network. It is designed as a drop-in replacement for the AI APIs you already use: it accepts requests in both the OpenAI and Anthropic API formats, so existing AI applications can be pointed at on-chain models with minimal changes.

Base URL

http://localhost:8547

The MCP REST API runs on port 8547 by default (configurable via --mcp-port). For production deployments, use your gateway's public URL with TLS enabled.

Authentication

All requests require a Bearer token in the Authorization header:

Authorization: Bearer ctr_sk_live_abc123...

API keys are generated through the Citrate dashboard or with the citrate keys create CLI command. A key is either scoped to specific models or granted global access.
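As a minimal sketch of attaching the token, the helper below builds a request with Python's standard library and sends nothing itself. The name build_request, the Content-Type header, and the choice of urllib are assumptions for illustration; only the base URL and the Authorization header format come from this page.

```python
import json
import urllib.request

# Default local endpoint from this page; use your gateway's TLS URL in production.
BASE_URL = "http://localhost:8547"


def build_request(path, api_key, body=None):
    """Return a urllib Request carrying the Bearer token (hypothetical helper)."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # Assumption: JSON bodies are sent with a JSON content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    # urllib uses GET when data is None and POST otherwise.
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)
```

With a node running, the request can be sent with urllib.request.urlopen(build_request("/v1/models", key)) and the body decoded with json.load.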

Rate Limits

| Scope     | Limit         |
|-----------|---------------|
| Global    | 1,000 req/min |
| Per model | 100 req/min   |
| Per key   | 500 req/min   |

Rate limit headers are included in every response:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 994
X-RateLimit-Reset: 1709145600

When a rate limit is exceeded, the API returns HTTP 429 with a Retry-After header.
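A client can use these headers to decide how long to back off. The sketch below assumes Retry-After holds a number of seconds and X-RateLimit-Reset holds a Unix timestamp in seconds (the sample value above looks like one); neither is confirmed by this page, and seconds_until_retry is a hypothetical helper name.

```python
import time


def seconds_until_retry(status, headers, now=None):
    """Return how long to wait before retrying; 0.0 when no wait is needed.

    `headers` is a plain dict keyed by the header names shown on this page.
    """
    if status != 429:
        return 0.0
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            # Assumption: Retry-After is expressed in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # the HTTP-date form is not handled in this sketch
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        # Assumption: the reset value is a Unix timestamp in seconds.
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)
    return 1.0  # arbitrary fallback when neither header is usable
```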

Model Discovery

List Models

GET /v1/models

Returns a paginated list of all models available on the network.

Query Parameters:

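| Parameter | Type   | Default | Description                                       |
|-----------|--------|---------|---------------------------------------------------|
| limit     | int    | 20      | Number of models to return                        |
| offset    | int    | 0       | Pagination offset                                 |
| format    | string | ...     | Filter by model format (onnx, gguf, safetensors)  |
| owner     | string | ...     | Filter by owner address                           |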

Response:

{
  "object": "list",
  "data": [
    {
      "id": "model_0xabc123",
      "object": "model",
      "name": "sentiment-v1",
      "format": "onnx",
      "owner": "0xOwnerAddress",
      "created": 1709145600,
      "inference_count": 4521,
      "permissions": ["inference", "benchmark"]
    }
  ],
  "has_more": true,
  "total": 142
}
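To walk the full list, a client can advance offset until has_more is false. The following is a sketch under stated assumptions: models_url and iter_models are hypothetical helpers, and fetch_page stands for whatever function performs the HTTP call and returns the parsed JSON page.

```python
from urllib.parse import urlencode


def models_url(base_url, limit=20, offset=0, **filters):
    """Build the List Models URL; filters may include format and owner."""
    params = {"limit": limit, "offset": offset}
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}/v1/models?{urlencode(params)}"


def iter_models(fetch_page, limit=20, **filters):
    """Yield every model, one page at a time.

    fetch_page(limit=..., offset=..., **filters) must return a parsed page
    shaped like the response above.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset, **filters)
        yield from page["data"]
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not page["data"]:
            break
        offset += len(page["data"])
```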

Get Model

GET /v1/models/:model_id

Returns detailed metadata for a single model.

Response:

{
  "id": "model_0xabc123",
  "object": "model",
  "name": "sentiment-v1",
  "format": "onnx",
  "owner": "0xOwnerAddress",
  "created": 1709145600,
  "inference_count": 4521,
  "model_hash": "0xdef456...",
  "storage_uri": "ipfs://Qm...",
  "input_schema": { "type": "string", "maxLength": 512 },
  "output_schema": { "type": "object" },
  "benchmark": {
    "latency_ms": 45,
    "throughput": 22,
    "accuracy_bps": 9420
  }
}
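The metadata can be used for a client-side check before sending input. This sketch assumes input_schema follows JSON Schema conventions (maxLength counted in characters) and that accuracy_bps is expressed in basis points, so 9420 would mean 94.2%; neither reading is confirmed by this page, and both helper names are hypothetical.

```python
def check_input(text, input_schema):
    """Return a list of problems; an empty list means the input fits the schema."""
    problems = []
    if input_schema.get("type") == "string" and not isinstance(text, str):
        problems.append("input must be a string")
        return problems
    max_len = input_schema.get("maxLength")
    if isinstance(text, str) and max_len is not None and len(text) > max_len:
        problems.append(f"input has {len(text)} characters; limit is {max_len}")
    return problems


def bps_to_percent(bps):
    """Convert basis points to a percentage (assumed meaning of accuracy_bps)."""
    return bps / 100
```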

Inference

Chat Completions (OpenAI-compatible)

POST /v1/chat/completions

Runs inference using the OpenAI chat completions format. This endpoint is compatible with the OpenAI SDK and any tooling that targets the OpenAI API.

Request Body:

{
  "model": "model_0xabc123",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is GhostDAG consensus?" }
  ],
  "temperature": 0.7,
  "max_tokens": 256,
  "stream": false,
  "proof": true
}

Response:

{
  "id": "inf_xyz789",
  "object": "chat.completion",
  "created": 1709145700,
  "model": "model_0xabc123",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "GhostDAG is a consensus protocol that generalizes Nakamoto consensus..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 128,
    "total_tokens": 152,
    "gas_used": "0x7a120"
  },
  "proof": "0xzkproof..."
}

When stream is set to true, the response is delivered as Server-Sent Events (SSE): each chunk is a JSON object on a line prefixed with data:.
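A streamed response can be consumed by reading lines and decoding every data: payload. The sketch below assumes each chunk is a single-line JSON object and that the stream ends with a [DONE] sentinel, as in the OpenAI convention; this page does not specify the terminator, so treat it as an assumption. iter_sse_json is a hypothetical helper.

```python
import json


def iter_sse_json(lines, done_marker="[DONE]"):
    """Yield the decoded JSON chunk from each `data:` line of an SSE stream.

    `lines` is any iterable of str or bytes lines, such as an HTTP response body.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == done_marker:
            break  # assumed end-of-stream sentinel
        if payload:
            yield json.loads(payload)
```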

Messages (Anthropic-compatible)

POST /v1/messages

Runs inference using the Anthropic messages format. Compatible with the Anthropic SDK.

Request Body:

{
  "model": "model_0xabc123",
  "max_tokens": 256,
  "messages": [
    { "role": "user", "content": "Explain the Medusa Paradigm." }
  ]
}

Response:

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "The Medusa Paradigm describes a decentralized coordination model..." }
  ],
  "model": "model_0xabc123",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 98
  }
}
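Because content is a list of typed blocks rather than a single string, callers usually join the text blocks. A minimal sketch, with message_text and total_tokens as hypothetical helper names operating on the parsed response shown above:

```python
def message_text(response):
    """Concatenate the text of every block whose type is "text"."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


def total_tokens(response):
    """Sum input and output tokens from the usage object."""
    usage = response.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
```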

Embeddings

POST /v1/embeddings

Generates vector embeddings for the given input text.

Request Body:

{
  "model": "model_0xembed01",
  "input": "The Citrate blockchain uses GhostDAG consensus.",
  "encoding_format": "float"
}

Response:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, "..."]
    }
  ],
  "model": "model_0xembed01",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
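A common use of the returned vectors is comparing two texts by cosine similarity. The sketch below is illustrative only: embedding_vectors and cosine_similarity are hypothetical helpers, and nothing here is specific to Citrate beyond the response shape shown above.

```python
import math


def embedding_vectors(response):
    """Return the embedding vectors ordered by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```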

Async Jobs

For long-running inference tasks (batch processing, large model inference, benchmarking), use the async job API.

Create Job

POST /v1/jobs

Request Body:

{
  "model": "model_0xabc123",
  "type": "batch_inference",
  "inputs": [
    { "content": "Input text 1" },
    { "content": "Input text 2" },
    { "content": "Input text 3" }
  ],
  "webhook_url": "https://your-app.com/webhooks/citrate"
}

Response:

{
  "id": "job_def456",
  "object": "job",
  "status": "queued",
  "created": 1709145800,
  "model": "model_0xabc123",
  "type": "batch_inference",
  "input_count": 3
}

Get Job Status

GET /v1/jobs/:id

Response:

{
  "id": "job_def456",
  "object": "job",
  "status": "completed",
  "created": 1709145800,
  "completed_at": 1709145830,
  "model": "model_0xabc123",
  "type": "batch_inference",
  "input_count": 3,
  "results": [
    { "index": 0, "output": "...", "gas_used": "0x5208" },
    { "index": 1, "output": "...", "gas_used": "0x5208" },
    { "index": 2, "output": "...", "gas_used": "0x5208" }
  ]
}

Job statuses: queued, processing, completed, failed.
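When no webhook is configured, a client can poll the status endpoint until the job reaches completed or failed. This is a sketch under stated assumptions: wait_for_job is a hypothetical helper, get_job stands for a function that calls GET /v1/jobs/:id and returns the parsed JSON, and the polling interval and timeout are arbitrary values, not documented recommendations.

```python
import time

# The two statuses from the list above after which a job no longer changes.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(get_job, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_job(job_id) until the job is terminal or the timeout passes."""
    deadline = clock() + timeout
    while True:
        job = get_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

The sleep and clock parameters exist so the loop can be exercised in tests without real delays.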

Error Responses

All errors follow a consistent format:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Model model_0xinvalid not found.",
    "code": "model_not_found"
  }
}

| HTTP Status | Error Type            | Description                        |
|-------------|-----------------------|------------------------------------|
| 400         | invalid_request_error | Malformed or missing parameters    |
| 401         | authentication_error  | Invalid or missing API key         |
| 403         | permission_error      | API key lacks required permissions |
| 404         | not_found_error       | Resource does not exist            |
| 429         | rate_limit_error      | Rate limit exceeded                |
| 500         | internal_error        | Server-side error                  |
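Since every error shares one envelope, a client can map it to a single exception type. The sketch below is an illustration: CitrateAPIError and parse_error are hypothetical names, and treating 429 and 500 as retryable is an assumption rather than documented behavior.

```python
import json

# Assumption: only rate-limit and server-side errors are worth retrying.
RETRYABLE_STATUSES = {429, 500}


class CitrateAPIError(Exception):
    """Hypothetical exception wrapping the error envelope shown above."""

    def __init__(self, status, error_type, code, message):
        super().__init__(f"{status} {error_type} ({code}): {message}")
        self.status = status
        self.error_type = error_type
        self.code = code
        self.retryable = status in RETRYABLE_STATUSES


def parse_error(status, body):
    """Build a CitrateAPIError from an HTTP status and a raw response body."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}  # non-JSON body, e.g. from a proxy in front of the API
    err = payload.get("error", {}) if isinstance(payload, dict) else {}
    return CitrateAPIError(
        status,
        err.get("type", "unknown_error"),
        err.get("code"),
        err.get("message", ""),
    )
```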