Every term Yoke speaks, defined.
Short, opinionated definitions for the RAG and agent evaluation terms you will run into across the docs, the dashboard, and the reports. Each page has the formula (if there is one), a concrete example, and how Yoke Agent uses the term in practice.
Faithfulness (RAG metric): Fraction of claims in the answer that are supported by the retrieved context.
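As a minimal sketch of the ratio, assuming claim extraction and support-checking have already happened upstream (the function name and counts here are illustrative, not Yoke's API):

```python
def faithfulness(supported_claims: int, total_claims: int) -> float:
    """Faithfulness = claims supported by retrieved context / all claims in the answer."""
    if total_claims == 0:
        return 0.0  # an answer with no claims has nothing to support
    return supported_claims / total_claims

# An answer making 4 claims, 3 of them backed by the retrieved chunks:
score = faithfulness(3, 4)  # 0.75
```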
Answer relevancy (RAG metric): How well the answer responds to the question that was actually asked.
Context precision (RAG metric): Signal-to-noise ratio of the chunks the retriever returned.
Context recall (RAG metric): Fraction of the ground-truth information the retriever actually found.
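Context precision and recall can be sketched as plain set ratios. This is the unweighted form for illustration; rank-weighted variants also exist, and the chunk-ID inputs here are an assumption, not Yoke's actual data model:

```python
def context_precision(retrieved: list[str], relevant: set[str]) -> float:
    """Of the chunks the retriever returned, what fraction were relevant?"""
    if not retrieved:
        return 0.0
    return sum(chunk in relevant for chunk in retrieved) / len(retrieved)

def context_recall(retrieved: list[str], ground_truth: set[str]) -> float:
    """Of the ground-truth chunks, what fraction did the retriever find?"""
    if not ground_truth:
        return 0.0
    return len(ground_truth & set(retrieved)) / len(ground_truth)

# Retriever returned 4 chunks, 2 relevant; ground truth had 4 chunks, 2 found:
p = context_precision(["a", "b", "x", "y"], {"a", "b"})        # 0.5
r = context_recall(["a", "b", "x"], {"a", "b", "c", "d"})       # 0.5
```

Precision punishes noise in what came back; recall punishes what never came back at all.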
Noise sensitivity (RAG metric): How much answer quality drops when irrelevant chunks are injected into context.
Entity recall (RAG metric): Fraction of named entities from the ground-truth answer actually recalled.
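Entity recall follows the same shape as context recall, just over named entities instead of chunks. A sketch, assuming entity extraction is done elsewhere:

```python
def entity_recall(answer_entities: set[str], truth_entities: set[str]) -> float:
    """Fraction of ground-truth named entities that appear in the answer."""
    if not truth_entities:
        return 0.0
    return len(answer_entities & truth_entities) / len(truth_entities)

# Ground truth names 4 entities; the answer recalled 2 of them:
score = entity_recall({"Paris", "2021"}, {"Paris", "2021", "France", "Macron"})  # 0.5
```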
Hallucination (quality signal): Factual claims in the output that are not supported by retrieved context or input.
G-Eval (framework): LLM-as-judge evaluation using chain-of-thought rubrics and weighted scoring.
LLM-as-judge (method): Using an LLM to score another LLM's output against a rubric.
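The LLM-as-judge pattern reduces to: build a prompt from a rubric and the candidate output, send it to a judge model, parse a score. A minimal sketch with the judge call stubbed out; the prompt shape, function names, and 0-1 scale are assumptions for illustration:

```python
def judge_score(rubric: str, output: str, call_llm) -> float:
    """Ask a judge model to score `output` against `rubric`; expects a 0-1 reply."""
    prompt = (
        f"Rubric:\n{rubric}\n\n"
        f"Candidate output:\n{output}\n\n"
        "Reply with a single score from 0 to 1:"
    )
    return float(call_llm(prompt))

# Stub judge for illustration; in practice this would be an LLM API call.
stub_judge = lambda prompt: "0.8"
score = judge_score("Answer must cite the retrieved context.",
                    "Paris, per chunk 2.", stub_judge)  # 0.8
```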
Tool-call accuracy (agent metric): Whether an agent invoked the right tool with the right arguments.
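A single tool call is correct only when both the tool name and the arguments match the expectation. A sketch using exact-match comparison (the dict shape here is a common convention, not necessarily Yoke's trace format):

```python
def tool_call_correct(call: dict, expected: dict) -> bool:
    """Right tool AND right arguments; either alone is a miss."""
    return (call.get("name") == expected["name"]
            and call.get("arguments") == expected["arguments"])

call = {"name": "get_weather", "arguments": {"city": "Oslo"}}
tool_call_correct(call, {"name": "get_weather", "arguments": {"city": "Oslo"}})   # True
tool_call_correct(call, {"name": "get_weather", "arguments": {"city": "Bergen"}}) # False
```

Exact match is strict on purpose: a fuzzier comparison (ignoring argument order, normalizing strings) is a separate design decision.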
Refusal accuracy (agent metric): When the agent refused, was the refusal correct? When it didn't refuse, should it have?
Grid search (workflow): Exhaustive evaluation of every combination in a defined configuration space.
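"Every combination in a defined configuration space" is exactly a Cartesian product. A sketch with hypothetical parameter names (`chunk_size`, `top_k`, `model` are examples, not Yoke's config keys):

```python
from itertools import product

space = {
    "chunk_size": [256, 512],
    "top_k": [3, 5],
    "model": ["small", "large"],
}

# Every combination: 2 * 2 * 2 = 8 configurations to evaluate.
configs = [dict(zip(space, values)) for values in product(*space.values())]
```

The run count multiplies with each new axis, which is why grid search is exhaustive but expensive.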
Poka-yoke (origin): Japanese manufacturing term for a jig that makes it impossible to ship a defective part.