Model Context Protocol (MCP)
The MCP layer handles model discovery, registration, and serving. Models are stored as GGUF, ONNX, or MLX files pinned to IPFS, and the on-chain registry contract maps model IDs to IPFS CIDs and serving addresses. Any node can register a model, consumers request inference through the registry, and the serving node receives the inference fee automatically via the smart contract.
We built the MCP marketplace to be permissionless: you do not need approval to list a model. You do, however, need a staked validator address to participate in the mentor-mentee protocol.
Permissionless Model Registry
Any participant can register an AI model on Citrate. There are no gatekeepers, no approval processes, and no centralized model marketplaces. The on-chain ModelRegistry (at precompile address 0x0100) stores model metadata, while MCP handles the off-chain routing and API translation.
Registration requires:
- A content-addressed model hash (IPFS CID or equivalent)
- An inference endpoint that conforms to the MCP API specification
- A CITE token stake bond (minimum 1,000 CITE)
- A declared capability set (text generation, embeddings, image generation, etc.)
Once registered, the model is immediately discoverable by any application on the network.
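The registration requirements above can be checked client-side before submitting the on-chain transaction. The sketch below is illustrative: the field names, the HTTPS requirement on the endpoint, and the dict encoding are assumptions; only the stated requirements (content hash, MCP endpoint, minimum 1,000 CITE stake, non-empty capability set) come from the list above.

```python
# Illustrative pre-submission validation for a ModelRegistry entry.
# Field names and the HTTPS check are assumptions, not part of the spec.

MIN_STAKE_CITE = 1_000

def validate_registration(reg: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry is submittable."""
    problems = []
    if not reg.get("model_cid"):
        problems.append("missing content-addressed model hash (IPFS CID)")
    if not reg.get("endpoint", "").startswith("https://"):
        problems.append("inference endpoint must be an MCP-conformant HTTPS URL")
    if reg.get("stake_cite", 0) < MIN_STAKE_CITE:
        problems.append(f"stake below minimum of {MIN_STAKE_CITE} CITE")
    if not reg.get("capabilities"):
        problems.append("declared capability set is empty")
    return problems
```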
API-Compatible Endpoints
MCP exposes three core endpoints that mirror industry-standard AI APIs. Any application built against the OpenAI or Anthropic SDKs can point at a Citrate MCP gateway with minimal changes.
Model Discovery
`GET /v1/models`
Returns all registered models with their capabilities, pricing, and reputation scores. Supports filtering by capability, minimum reputation, and price range.
```json
{
  "data": [
    {
      "id": "citrate-llama-70b",
      "object": "model",
      "owned_by": "0xabc...def",
      "capabilities": ["text-generation", "function-calling"],
      "price_per_token": "0.00001",
      "reputation_score": 0.97
    }
  ]
}
```
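The same filters the endpoint supports (capability, minimum reputation, price range) can also be applied client-side to a response payload. This sketch works over the response shape shown above; no fields beyond that example are assumed.

```python
# Client-side filtering of a GET /v1/models response, mirroring the
# server-side filters: capability, minimum reputation, and price ceiling.

def filter_models(models, capability=None, min_reputation=0.0, max_price=None):
    out = []
    for m in models:
        if capability and capability not in m["capabilities"]:
            continue
        if m["reputation_score"] < min_reputation:
            continue
        # price_per_token is a decimal string in the payload
        if max_price is not None and float(m["price_per_token"]) > max_price:
            continue
        out.append(m)
    return out
```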
Chat Completions
`POST /v1/chat/completions`
Routes chat completion requests to the best available model based on the caller's preferences (cost, latency, accuracy). Supports streaming responses.
```json
{
  "model": "citrate-llama-70b",
  "messages": [
    {"role": "user", "content": "Explain GhostDAG consensus."}
  ],
  "stream": true
}
```
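When `stream` is `true`, a client consumes the response incrementally. The sketch below assumes the OpenAI-style server-sent-events framing (`data: {...}` lines terminated by `data: [DONE]`); that framing is an assumption based on the API-compatibility claim above, not something the MCP specification here states.

```python
import json

# Sketch of consuming a streamed chat completion, assuming OpenAI-style
# SSE framing: "data: <json chunk>" lines, ending with "data: [DONE]".

def collect_stream(sse_lines):
    """Concatenate the content deltas from an SSE chat-completion stream."""
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)
```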
Embeddings
`POST /v1/embeddings`
Generates vector embeddings using registered embedding models. Compatible with applications that use OpenAI's embedding API.
```json
{
  "model": "citrate-embed-v1",
  "input": "Paraconsistent consensus treats contradictions as learning signals."
}
```
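A typical downstream use of the returned vectors is cosine similarity, e.g. for semantic search over embedded documents. The vectors in the test are illustrative values, not real model output.

```python
import math

# Cosine similarity between two embedding vectors, the usual way
# applications compare /v1/embeddings output.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```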
Inference Routing
When a request arrives at an MCP gateway, the routing engine selects the optimal provider using a multi-factor scoring algorithm:
- Model match -- Does the provider serve the requested model (or a compatible variant)?
- Reputation -- What is the provider's historical accuracy and uptime, derived from blue score and attestation history?
- Latency -- What is the estimated round-trip time based on geographic proximity?
- Price -- Does the provider's fee fit within the caller's budget?
- Load -- Is the provider currently at capacity?
If the caller specifies a model ID, routing targets that specific model. If the caller specifies only a capability (e.g., "text-generation"), MCP selects the best provider automatically.
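The selection logic above can be sketched as hard pass/fail filters (model match, budget, capacity) followed by a weighted score over the remaining factors. The five factors come from the list; the weights, the normalization constants, and the treatment of budget and capacity as hard filters are assumptions chosen for illustration, not the production algorithm.

```python
# Minimal sketch of multi-factor provider scoring. Weights and
# normalization are illustrative assumptions.

def score_provider(p, request, weights=(0.4, 0.3, 0.2, 0.1)):
    # Hard filters: model match, budget, and capacity are pass/fail.
    if request["model"] not in p["models"]:
        return None
    if p["price_per_token"] > request["max_price"]:
        return None
    if p["load"] >= 1.0:  # provider at capacity
        return None
    w_rep, w_lat, w_price, w_load = weights
    latency = 1.0 - min(p["latency_ms"], 1000) / 1000   # lower latency scores higher
    price = 1.0 - p["price_per_token"] / request["max_price"]
    return (w_rep * p["reputation"] + w_lat * latency
            + w_price * price + w_load * (1.0 - p["load"]))

def select_provider(providers, request):
    scored = [(score_provider(p, request), p) for p in providers]
    scored = [(s, p) for s, p in scored if s is not None]
    return max(scored, key=lambda sp: sp[0])[1] if scored else None
```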
Fee Distribution
Inference fees flow through a transparent on-chain settlement:
| Recipient | Share | Purpose |
|---|---|---|
| Model provider | 70% | Compensation for compute |
| Network validators | 20% | Block production incentive |
| Protocol treasury | 10% | Ongoing development funding |
Fees are denominated in CITE and settled at each finality checkpoint. Providers can set their own per-token pricing, and the market determines which models attract usage. Our vision for the MCP marketplace is that it should be as open and competitive as possible -- anyone can serve, anyone can consume.
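The 70/20/10 split works out as follows for a concrete fee. Integer arithmetic in the smallest CITE denomination avoids rounding drift; the rule that any division remainder goes to the treasury is an assumption for the sketch, not specified above.

```python
# Worked example of the 70/20/10 inference-fee split, in the smallest
# CITE unit. Remainder handling (treasury absorbs it) is an assumption.

def split_fee(fee_units: int) -> dict:
    provider = fee_units * 70 // 100
    validators = fee_units * 20 // 100
    treasury = fee_units - provider - validators
    return {"provider": provider, "validators": validators, "treasury": treasury}
```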
Attestation and Quality
Every inference response includes a cryptographic attestation that can be verified on-chain via the AttestationVerifier precompile. This creates an auditable trail of model performance and enables:
- Reputation tracking -- Nodes that consistently produce accurate results build higher reputation scores.
- Dispute resolution -- If a response is challenged, the attestation proves what model and input produced it.
- Quality guarantees -- Applications can require minimum reputation thresholds for their inference requests.
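The auditable-trail idea can be illustrated with a simple hash commitment over the model ID, input, and output. This is illustrative only: the actual Citrate attestation format, signature scheme, and the checks performed by the AttestationVerifier precompile are not specified here.

```python
import hashlib

# Illustrative hash-commitment attestation over (model id, input, output).
# Not the real Citrate scheme; it only demonstrates how a challenged
# response can be tied back to the exact model and input that produced it.

def attest(model_id: str, prompt: str, output: str) -> str:
    h = hashlib.sha256()
    for part in (model_id, prompt, output):
        h.update(len(part).to_bytes(4, "big"))  # length-prefix each field
        h.update(part.encode("utf-8"))
    return h.hexdigest()

def verify(attestation: str, model_id: str, prompt: str, output: str) -> bool:
    return attestation == attest(model_id, prompt, output)
```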
Further Reading
- AI Precompiles -- the on-chain contracts that underpin MCP
- LoRA Adapters -- how fine-tuned model variants are registered and served
- Finality Checkpoints -- when inference fees are settled