shikib 2 hours ago

An ideal evaluation system should surface failure modes and usage patterns, provide actionable insights, adapt to your product, and continuously improve over time.

Nothing out there met our eval needs.

So we built AgentLens: a multi-agent system that surfaces deeper insights and continuously learns what matters for your product.

Check out our blog post for a deep dive on the system and example dashboards produced by AgentLens!

sheshansh 2 hours ago

Love that evaluation outputs are interactive