Evaluation (AI)

Source: content/manual/06-glossary/ai/evaluation.md

Definition

Measuring agent-assisted work against baselines (cycle time, corrections, defects, cost) to determine value.

Why it matters

Avoids cargo-cult adoption and ensures guardrails don’t erase benefits.

Common pitfalls

  • No baseline before pilots.
  • Vanity metrics without quality checks.

References