Testing The Untestable Evaluation Driven - Detailed Analysis & Overview

Testing the untestable: Computer Software Validation of AI functionality

AI fundamentally shifts software

Assessing AI performance with Evaluation-Driven Development

Test

Testing the Untestable: Evaluation-Driven Development for Financial AI Agents

Vincent Caldeira (Field CTO at Red Hat) and Valentina Rodriguez Sosa (Principal Architect at Red Hat) map out a comprehensive ...

How to Systematically Setup LLM Evals (Metrics, Unit Tests, LLM-as-a-Judge)

Want to learn real AI Engineering? Go here: https://go.datalumina.com/iIO93Ps Want to start freelancing? Let me help: ...

Human-in-the-Loop Evaluation with NIMBUS Uno | AI Testing, Interpretability & GenAI Reliability

Human-in-the-Loop (HITL)

Context-Driven Testing

This is where I like to say, I have always been a really big believer in context-

Advanced Agent Testing Using Evaluations

This video shows how to use the

How to Evaluate LLMs in Production - AI Testing Framework with Hamel Hussain

The hard truth: Most AI projects don't fail because of the model — they fail because nobody clearly defined what success looks ...

Piotr Stawirej: Testing the untestable - patterns and use cases analysis | JDD 2023

During my journey as a programmer, trainer and mentor I have encountered multiple situations when my colleagues or myself ...

Beyond the Prompt: Evaluating, Testing, and Securing LLM Applications - Mete Atamel

This talk was recorded at NDC Copenhagen in Copenhagen, Denmark. #ndccopenhagen #ndcconferences #developer ...

LLM as a Judge: Scaling AI Evaluation Strategies

Ready to become a certified watsonx AI Assistant Engineer? Register now and use code IBMTechYT20 for 20% off of your

From Vibe Testing to Eval Driven AI Testing - Mar 26, 2026

This episode explores the shift from manual “vibe

Evaluation-driven development for enterprise AI agents

Enterprise AI agents aren't defined by demos; they're defined by