Evaluation (AI)
Source: content/manual/06-glossary/ai/evaluation.md
Definition
Measuring agent-assisted work against baselines recorded before adoption (cycle time, corrections, defects, cost) to determine whether it delivers net value.
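As a rough illustration of the comparison described above, the sketch below computes the relative change of each metric between a baseline period and a pilot period. The `Metrics` class, its field names, and all numbers are hypothetical, chosen only to mirror the four metrics listed in the definition; a real evaluation would also need comparable task mixes and enough samples to be meaningful.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Per-task averages for one measurement period (hypothetical fields)."""
    cycle_time_hours: float
    corrections: float
    defects: float
    cost_usd: float


def relative_change(baseline: Metrics, pilot: Metrics) -> dict[str, float]:
    """Return (pilot - baseline) / baseline per metric.

    All four metrics are "lower is better", so a negative value is an improvement.
    """
    changes = {}
    for name in ("cycle_time_hours", "corrections", "defects", "cost_usd"):
        base = getattr(baseline, name)
        if base == 0:
            raise ValueError(f"baseline for {name} is zero; relative change is undefined")
        changes[name] = (getattr(pilot, name) - base) / base
    return changes


# Made-up numbers, for illustration only.
baseline = Metrics(cycle_time_hours=10.0, corrections=4.0, defects=2.0, cost_usd=500.0)
pilot = Metrics(cycle_time_hours=8.0, corrections=5.0, defects=2.0, cost_usd=450.0)

for metric, change in relative_change(baseline, pilot).items():
    print(f"{metric}: {change:+.0%}")
```

In this made-up data, cycle time and cost fall while corrections rise, which is the kind of mixed result a single headline metric would hide.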
Why it matters
Avoids cargo-cult adoption (using agents because others do, without evidence that they help) and ensures the overhead of guardrails doesn’t erase the benefits.
Common pitfalls
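- Starting pilots without recording a baseline first, which leaves nothing to compare results against.
- Reporting vanity metrics (for example, volume of output) without checking the quality of that output.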
