API reference

Everything the Yoke Agent API exposes.

The platform's REST surface is the same API the dashboard talks to. Every pipeline — ingestion, dataset, grid-search, agent simulation — is wired through these endpoints and fully documented at /docs (Swagger) once the server is running.

Introduction

What you're looking at.

Yoke Agent ships a FastAPI backend that exposes every piece of the platform over REST. Dashboards, CLI and MCP clients all talk through the same surface documented here — there is no private API.

Responses are JSON, dates are ISO 8601, identifiers are integer primary keys. Long-running operations (dataset generation, grid-search runs, simulation sweeps) are kicked off by an endpoint and then observed via the experiment-status / transcripts endpoints — never blocking the HTTP round-trip.
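
The kick-off-then-observe pattern reduces to one loop: start the job, then poll a status endpoint until it settles. A minimal sketch of that loop; the terminal state names ("completed", "failed", "stopped") are assumptions here, not the documented schema, and the status source is injected so the loop stays transport-agnostic:

```python
import time

def wait_until_done(fetch_status, terminal=("completed", "failed", "stopped"),
                    interval=2.0, timeout=600.0):
    """Poll fetch_status() until it returns a terminal state or we time out.

    fetch_status is any callable returning the current status string, e.g. a
    wrapper around GET /api/experiments/{id} -- injected so this works the
    same for datasets, grid searches and simulation sweeps.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in terminal:
            return status
        time.sleep(interval)
    raise TimeoutError("operation did not settle in time")

# Demonstrated with a stubbed status source (a real client would hit the API):
states = iter(["pending", "running", "completed"])
print(wait_until_done(lambda: next(states), interval=0))  # → completed
```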

Install & run

One command to get the stack up.

Clone the repository, then bring up the backend, frontend, worker and ChromaDB in one shot. Everything runs locally — no data leaves your machine.

Terminal
$ git clone https://github.com/Empreiteiro/yoke-agent.git
$ cd yoke-agent
$ make dev
# → dashboard  http://localhost:3000
# → REST API   http://localhost:4040
# → Swagger UI http://localhost:4040/docs
# → ReDoc      http://localhost:4040/redoc

Base URL

Everything hangs off /api.

In the default compose setup the API is served at http://localhost:4040. In production the base URL is whatever you front it with. All routes below are relative to that host.

Tip

The Swagger UI at /docs and ReDoc at /redoc are generated live from the same schemas this page documents — keep them in view while you wire clients.

Authentication

JWT, with API keys as a shortcut.

Most endpoints need a bearer token. Two ways to get one:

  • Login flow — POST email + password to /api/users/login and use the returned access_token.
  • API keys — create one under /api/auth/keys and send it as a bearer token. Handy for CI and server-to-server scripts.

Login & use
$ TOKEN=$(curl -s http://localhost:4040/api/users/login \
    -H "Content-Type: application/json" \
    -d '{"email":"you@example.com","password":"..."}' | jq -r .access_token)

$ curl -H "Authorization: Bearer $TOKEN" \
    http://localhost:4040/api/users/me
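
On the wire the two credential styles are interchangeable: whichever token you hold travels in the same Authorization header. A standard-library sketch of that; the create payload ({"name": ...}) and the response field carrying the key material are assumptions, so check /docs for the real schemas:

```python
import json
import urllib.request

BASE = "http://localhost:4040"

def build_request(method, path, token=None, body=None):
    """Assemble a JSON request; a JWT and an API key are sent identically."""
    data = None if body is None else json.dumps(body).encode()
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(BASE + path, method=method,
                                  data=data, headers=headers)

def call(method, path, token=None, body=None):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(build_request(method, path, token, body)) as resp:
        return json.loads(resp.read())

# Example flow (requires a running server; field names are assumptions):
#   jwt = call("POST", "/api/users/login",
#              body={"email": "...", "password": "..."})["access_token"]
#   key = call("POST", "/api/auth/keys", token=jwt, body={"name": "ci"})
#   me  = call("GET", "/api/users/me", token=jwt)
```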

Errors

Standard HTTP, JSON payload.

Non-2xx responses carry a detail field (string or object) in the body. 401 means the token is missing or expired; 403 means you're authenticated but don't own the resource. 404 covers both missing resources and resources owned by other users: lookups are silently scoped to the caller, so you won't see other users' IDs.
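
Because detail can be a string or a structured object (FastAPI's validation errors arrive as a list of objects), client code should normalise it before logging. A small sketch of that normalisation:

```python
import json

def extract_detail(status_code, body_bytes):
    """Pull the detail field out of a non-2xx response body.

    Returns None for success responses, the detail string for simple errors,
    and a JSON dump for structured (e.g. validation) errors.
    """
    if 200 <= status_code < 300:
        return None
    try:
        detail = json.loads(body_bytes).get("detail")
    except (ValueError, AttributeError):
        # Non-JSON body (proxy error page, etc.): pass it through readably.
        return body_bytes.decode(errors="replace")
    if isinstance(detail, str):
        return detail
    return json.dumps(detail)  # structured detail stays inspectable

print(extract_detail(401, b'{"detail": "Not authenticated"}'))  # → Not authenticated
```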

Platform

Auth

API-key bootstrap mode and per-user API keys.

GET      /api/auth/status - Return API-key bootstrap mode and current count.
GET      /api/auth/keys - List API keys for the caller (admins see all).
POST     /api/auth/keys - Create a new API key.
DELETE   /api/auth/keys/{key_id} - Delete an API key by id.

Platform

Users

Registration, login and audit trail. The first user created becomes admin.

POST     /api/users/register - Create an account. First caller is promoted to admin.
POST     /api/users/login - Exchange email and password for a JWT.
GET      /api/users/me - Return the currently authenticated user.
GET      /api/users/ - List all users (admin only).
PATCH    /api/users/{user_id}/role - Change a user's role (admin only).
GET      /api/users/audit-logs - Paginated audit log (admin only).

Platform

MCP

Tools the MCP server exposes, and the toggle to disable individual ones.

GET      /api/mcp/tools - List MCP tools and their enabled status.
PUT      /api/mcp/tools - Update the MCP tool disabled-list.
GET      /api/mcp/connection-info - Get the MCP server path and connection details.

Platform

Providers

LLM and embedding providers — OpenAI, Anthropic, Gemini, Cohere, Azure, Ollama, HuggingFace, Claude Code CLI, and custom OpenAI-compatible endpoints.

GET      /api/providers/catalog - Supported provider types and their models.
GET      /api/providers/ollama/discover - Discover models on a running Ollama server.
GET      /api/providers/ - List configured providers.
POST     /api/providers/ - Create a provider.
GET      /api/providers/{provider_id} - Get a provider by id.
PUT      /api/providers/{provider_id} - Update a provider.
DELETE   /api/providers/{provider_id} - Delete a provider.
POST     /api/providers/{provider_id}/test - Test provider connectivity.
GET      /api/providers/models/available - Aggregate models across active providers.
GET      /api/providers/models/embedding-dimensions - Embedding-dimension capabilities per model.
GET      /api/providers/claude-code/status - Check the Claude Code CLI install status.
POST     /api/providers/claude-code/auth - Save the Claude Code OAuth token.

Platform

Vector stores

ChromaDB, Pinecone, Qdrant, pgvector and Astra DB. Managed as configurations you can test-connect before running experiments.

GET      /api/vectorstores/catalog - Supported vector-store types.
GET      /api/vectorstores/configs - List vector-store configurations.
POST     /api/vectorstores/configs - Create a configuration.
PUT      /api/vectorstores/configs/{config_id} - Update a configuration.
DELETE   /api/vectorstores/configs/{config_id} - Delete a configuration.
POST     /api/vectorstores/configs/{config_id}/test - Test connectivity for a configuration.
GET      /api/vectorstores/available - ChromaDB plus every active configured store.

Platform

Token usage

Every LLM and embedding call logs token counts and an estimated USD cost. Flex-tier discounts are detected automatically.

GET      /api/token-usage/ - List usage rows with filters.
GET      /api/token-usage/summary - Aggregated summary: totals, per-provider and per-day breakdowns.

Platform

Traces

OpenTelemetry traces captured during runs. Pipe out via OTLP for Grafana, Honeycomb or Datadog.

GET      /api/traces/ - List traces with optional filters.
GET      /api/traces/summary - Aggregated trace statistics.
GET      /api/traces/{trace_id} - Trace detail with span tree.
DELETE   /api/traces/{trace_id} - Delete a trace.

Platform

Chat

A streaming SSE endpoint backed by your default provider — the assistant the dashboard ships with.

POST     /api/chat/stream - Stream chat deltas via Server-Sent Events.

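
SSE streams arrive as newline-delimited frames where lines prefixed with data: carry the payload and a blank line closes an event. A minimal parser for that framing; the delta JSON shape and the [DONE] sentinel are assumptions borrowed from common streaming APIs, so inspect an actual response from /api/chat/stream for the real protocol:

```python
import json

def iter_sse_data(lines):
    """Yield the payload of each `data:` line in an SSE stream."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield line[len("data:"):].strip()

# A captured-looking stream (payload shape and [DONE] sentinel are assumed):
stream = [
    'data: {"delta": "Hel"}',
    "",
    'data: {"delta": "lo"}',
    "",
    "data: [DONE]",
]
chunks = [d for d in iter_sse_data(stream) if d != "[DONE]"]
print("".join(json.loads(c)["delta"] for c in chunks))  # → Hello
```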
RAG workbench

Documents

Upload and manage the source documents that feed evaluation datasets and RAG pipelines.

POST     /api/documents/upload - Upload PDF, DOCX, TXT or Markdown files.
GET      /api/documents/ - List documents scoped to the caller.
GET      /api/documents/{document_id} - Get a document by id.
DELETE   /api/documents/{document_id} - Delete a document.

RAG workbench

Datasets

Evaluation Q&A sets. Generate with an LLM from your docs, import curated JSON, edit, review and lock with approve.

POST     /api/datasets/ - Create an evaluation dataset.
GET      /api/datasets/ - List datasets; benchmarks are shared, the rest are user-scoped.
GET      /api/datasets/{dataset_id} - Get a dataset by id.
DELETE   /api/datasets/{dataset_id} - Delete a dataset.
POST     /api/datasets/bulk-delete - Delete many datasets at once.
POST     /api/datasets/seed-benchmarks - Seed the built-in benchmark datasets.
POST     /api/datasets/{dataset_id}/generate - Generate Q&A pairs with an LLM (background).
POST     /api/datasets/{dataset_id}/items - Add a Q&A item manually.
PUT      /api/datasets/{dataset_id}/items/{item_id} - Update a dataset item.
DELETE   /api/datasets/{dataset_id}/items/{item_id} - Delete a dataset item.
POST     /api/datasets/{dataset_id}/import - Import items from a JSON file.
POST     /api/datasets/{dataset_id}/duplicate - Duplicate a dataset including all items.
POST     /api/datasets/{dataset_id}/approve - Lock a dataset as approved for experiments.

RAG workbench

Experiments

Grid-search runs. Propose a grid with an LLM, estimate cost, configure combinations, run, and read the RAGAS-scored report.

POST     /api/experiments/ - Create an experiment.
GET      /api/experiments/ - List experiments for the caller.
GET      /api/experiments/{experiment_id} - Get an experiment by id.
DELETE   /api/experiments/{experiment_id} - Delete an experiment.
POST     /api/experiments/bulk-delete - Delete many experiments at once.
POST     /api/experiments/{experiment_id}/propose-grid - Propose a grid configuration with an LLM.
POST     /api/experiments/{experiment_id}/estimate-cost - Estimate tokens and USD for a proposed grid.
POST     /api/experiments/{experiment_id}/configure-grid - Materialise combinations from the grid config.
POST     /api/experiments/{experiment_id}/run - Start execution (background).
POST     /api/experiments/{experiment_id}/stop - Stop a running experiment.
POST     /api/experiments/{experiment_id}/resume - Resume a stopped or failed experiment.
GET      /api/experiments/{experiment_id}/report - Get the results report.
GET      /api/experiments/{experiment_id}/export - Export results as CSV.
GET      /api/experiments/custom-metrics/registered - List custom metrics registered via the hook.
GET      /api/experiments/{experiment_id}/failures/export - Export failure analysis as CSV.
GET      /api/experiments/combinations/{combination_id}/ratings - List human ratings for a combination.
POST     /api/experiments/combinations/{combination_id}/ratings - Create or update a rating for a (combination, question) pair.
PATCH    /api/experiments/combinations/{combination_id}/ratings/{rating_id} - Update a rating.
DELETE   /api/experiments/combinations/{combination_id}/ratings/{rating_id} - Delete a rating.
GET      /api/experiments/{experiment_id}/correlation - Pearson and Spearman correlations between ratings and metrics.
GET      /api/experiments/{experiment_id}/export-rated - Export rated questions with RAGAS scores.

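
The lifecycle implied by these endpoints (create, propose, estimate, configure, run, report) can be sketched as one function over an injected transport. The paths come from the table above; every payload field (dataset_id, passing the proposed grid back to estimate-cost and configure-grid) is an assumption to be checked against /docs:

```python
def run_grid_search(call, dataset_id):
    """Walk the grid-search lifecycle end to end.

    `call(method, path, body=None)` is any JSON transport (urllib, requests,
    httpx) that returns the decoded response body.
    """
    exp = call("POST", "/api/experiments/", body={"dataset_id": dataset_id})
    eid = exp["id"]
    grid = call("POST", f"/api/experiments/{eid}/propose-grid")
    call("POST", f"/api/experiments/{eid}/estimate-cost", body=grid)
    call("POST", f"/api/experiments/{eid}/configure-grid", body=grid)
    call("POST", f"/api/experiments/{eid}/run")  # background job
    # In practice, poll the experiment status before fetching the report.
    return call("GET", f"/api/experiments/{eid}/report")

# Exercising the flow with a recording stub instead of a live server:
log = []
def stub(method, path, body=None):
    log.append((method, path))
    return {"id": 7}
run_grid_search(stub, dataset_id=1)
print(log[0])   # → ('POST', '/api/experiments/')
print(log[-1])  # → ('GET', '/api/experiments/7/report')
```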
RAG workbench

Collections

Saved winning collections — the vector stores materialised from an experiment's best configuration, ready to query.

GET      /api/collections/ - List saved collections with sizes.
DELETE   /api/collections/{collection_id} - Delete a saved collection.
POST     /api/collections/{collection_id}/query - Query a collection and synthesise an answer.
POST     /api/collections/save-from-experiment/{experiment_id} - Persist the winning config from an experiment.

Agent workbench

Agent configs

Agent definitions — system prompt, tools, MCP servers, guardrails — plus prompt variants, scenarios and personas.

GET      /api/agent/configs/ - List agent configurations.
POST     /api/agent/configs/ - Create an agent configuration.
GET      /api/agent/configs/{config_id} - Get a configuration by id.
PUT      /api/agent/configs/{config_id} - Update a configuration.
DELETE   /api/agent/configs/{config_id} - Delete a configuration.
POST     /api/agent/configs/test-http - Dry-run test for an external HTTP agent.
POST     /api/agent/configs/{config_id}/test-http - Test the HTTP endpoint bound to a saved config.
POST     /api/agent/configs/enhance/prompt - LLM-assisted system-prompt rewrite.
POST     /api/agent/configs/enhance/description - LLM-assisted description rewrite.
POST     /api/agent/configs/enhance/suggest-tools - Suggest tools to attach to the agent.
GET      /api/agent/configs/{config_id}/variants - List prompt variants.
POST     /api/agent/configs/{config_id}/variants - Create a prompt variant.
DELETE   /api/agent/configs/{config_id}/variants/{variant_id} - Delete a variant.
POST     /api/agent/configs/{config_id}/variants/generate - Generate variants with an LLM.
GET      /api/agent/configs/{config_id}/scenarios - List scenarios.
POST     /api/agent/configs/{config_id}/scenarios - Create a scenario.
POST     /api/agent/configs/{config_id}/scenarios/bulk - Create many scenarios at once.
PUT      /api/agent/configs/{config_id}/scenarios/{scenario_id} - Update a scenario.
DELETE   /api/agent/configs/{config_id}/scenarios/{scenario_id} - Delete a scenario.
POST     /api/agent/configs/{config_id}/scenarios/generate - Generate scenarios with an LLM.
GET      /api/agent/configs/{config_id}/personas - List personas.
POST     /api/agent/configs/{config_id}/personas - Create a persona.
POST     /api/agent/configs/{config_id}/personas/bulk - Create many personas at once.
PUT      /api/agent/configs/{config_id}/personas/{persona_id} - Update a persona.
DELETE   /api/agent/configs/{config_id}/personas/{persona_id} - Delete a persona.
POST     /api/agent/configs/{config_id}/personas/generate - Generate synthetic personas with an LLM.
GET      /api/agent/configs/{config_id}/personas/templates - List pre-built persona archetypes.

Agent workbench

Agent experiments

Agent datasets, simulation sweeps, transcripts and G-Eval scoring.

GET      /api/agent/datasets - List interaction datasets.
POST     /api/agent/datasets - Create an interaction dataset.
GET      /api/agent/datasets/{dataset_id} - Get an interaction dataset by id.
DELETE   /api/agent/datasets/{dataset_id} - Delete an interaction dataset.
POST     /api/agent/datasets/{dataset_id}/duplicate - Duplicate a dataset including items.
GET      /api/agent/datasets/{dataset_id}/items - List items.
POST     /api/agent/datasets/{dataset_id}/items - Add an item.
DELETE   /api/agent/datasets/{dataset_id}/items/{item_id} - Delete an item.
POST     /api/agent/datasets/{dataset_id}/import - Import items.
GET      /api/agent/datasets/{dataset_id}/export - Export an interaction dataset.
POST     /api/agent/datasets/generate - Generate an interaction dataset with an LLM.
GET      /api/agent/experiments - List agent experiments.
POST     /api/agent/experiments - Create an agent experiment.
GET      /api/agent/experiments/{exp_id} - Get an experiment by id.
DELETE   /api/agent/experiments/{exp_id} - Delete an experiment.
POST     /api/agent/experiments/bulk-delete - Delete many experiments at once.
GET      /api/agent/experiments/{exp_id}/combinations - List combinations in an experiment.
POST     /api/agent/experiments/{exp_id}/run - Run all combinations (background).
POST     /api/agent/experiments/{exp_id}/stop - Stop a running experiment.
POST     /api/agent/experiments/{exp_id}/resume - Resume a stopped or failed experiment.
GET      /api/agent/experiments/{exp_id}/status - Get experiment status counts.
GET      /api/agent/experiments/{exp_id}/transcripts - List simulation transcripts.
GET      /api/agent/transcripts/{transcript_id} - Get a transcript by id.
GET      /api/agent/transcripts/{transcript_id}/metrics - Get metrics for a transcript.
GET      /api/agent/experiments/{exp_id}/results - Aggregated results for every combination.
POST     /api/agent/experiments/{exp_id}/evaluate - Evaluate all completed transcripts.
POST     /api/agent/experiments/{exp_id}/report/generate - Generate an LLM-powered improvement report.
GET      /api/agent/experiments/{exp_id}/report - Fetch a previously generated report.

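
A simulation sweep follows the same run-observe-score rhythm: run the combinations, watch the status counts, then evaluate and report. A sketch over an injected transport; the shape of the status payload (a dict of state to count) is an assumption, so verify it against /docs:

```python
def run_agent_sweep(call, exp_id):
    """Run an agent experiment, wait on its status counts, then score it.

    `call(method, path)` is any JSON transport returning decoded bodies.
    """
    call("POST", f"/api/agent/experiments/{exp_id}/run")
    while True:  # in real use, time.sleep between polls
        counts = call("GET", f"/api/agent/experiments/{exp_id}/status")
        if counts.get("running", 0) == 0 and counts.get("pending", 0) == 0:
            break
    call("POST", f"/api/agent/experiments/{exp_id}/evaluate")
    call("POST", f"/api/agent/experiments/{exp_id}/report/generate")
    return call("GET", f"/api/agent/experiments/{exp_id}/report")

# Stubbed transport: one in-flight poll, then done.
polls = iter([{"running": 2, "pending": 1}, {"running": 0, "pending": 0}])
def stub(method, path):
    if path.endswith("/status"):
        return next(polls)
    return {"ok": True}
print(run_agent_sweep(stub, 3))  # → {'ok': True}
```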
Looking for more?

Schemas, request/response shapes and validation rules live in the Swagger UI at /docs once the backend is running. The source for every router is at backend/app/api.