Evaluating Agents With Braintrust

Evaluating Agents With Braintrust - Detailed Analysis & Overview

Most teams approach evals like unit tests and try to cover every possible failure. Phil Hetzel from ... This hands-on workshop guides participants through the full AI ... Jason Lopatecki, Co-Founder and CEO of Arize AI, dives into the world of ... In this video, we walk through the complete eval workflow, including creating datasets, prompts, and scorers. Traditional observability answers one question: is the system up? Phil Hetzel from ... Your production traces show how customers are using your ...

Evaluating Agents with Braintrust

Greylock Change

How to evaluate AI agents with Braintrust

Join

The maturity phases of running evals — Phil Hetzel, Braintrust

Most teams approach evals like unit tests and try to cover every possible failure. Phil Hetzel from

Evals 101 — Doug Guthrie, Braintrust

This hands-on workshop guides participants through the full AI

Evaluating Agents and Assistants: The AI Conference

Jason Lopatecki, Co-Founder and CEO of Arize AI, dives into the world of

Evaluating agents: how we built Loop, the AI assistant for evals

Join Doug Guthrie, Solutions Engineer at

Intro to Evals with Braintrust

In this video, we walk through the complete eval workflow, including creating datasets, prompts, and scorers.

How agent o11y differs from traditional o11y — Phil Hetzel, Braintrust

Traditional observability answers one question: is the system up? Phil Hetzel from

Intro to Loop: the AI agent for evals and observability in Braintrust

Learn how you can use Loop in

Turning traces into agent decisions

Your production traces show how customers are using your

Intro to Braintrust: AI Observability and Evals

An end-to-end walkthrough of

Braintrust and Box on AI agents and the future of AI observability

BrainTrust

Evals Course: Analyzing multi turn traces

We've now moved on to evals for multi-turn conversations in