An ideal evaluation system should surface failure modes and usage patterns, provide actionable insights, adapt to your product, and continuously improve over time.
Nothing out there met our eval needs.
So we built AgentLens: a multi-agent system that surfaces deeper insights and continuously learns what matters for your product.
Check out our blog post for a deep dive on the system and example dashboards produced by AgentLens!
Love that evaluation outputs are interactive