Media Summary: In this AI Research Roundup episode, Alex discusses the paper: ' ... Recording of a live panel featuring WireMock, StrongDM, Docker, and LocalStack. With AI generating ... Kent Beck is one of the most influential figures in modern software development. Creator of Extreme ...

NatureBench Testing Coding Agents On - Detailed Analysis & Overview

NatureBench: Testing Coding Agents on Science

In this AI Research Roundup episode, Alex discusses the paper: '

NatureBench: can AI agents beat science's record, not just copy it?

NatureBench tests

I Sandboxed My Coding Agents. You Should Too.

Coding agents

What Is Agentic Coding? How AI Agents Modernize Code

Learn more about Agentic

The future of test environments for agentic coding

Recording of a live panel featuring WireMock, StrongDM, Docker, and LocalStack. With AI generating

Coding Agents Can Cheat—And This Paper Catches Them

The big shift here is that

TDD, AI agents and coding with Kent Beck

Kent Beck is one of the most influential figures in modern software development. Creator of Extreme

The Coding Agent Platform for Software Testing

Sick of random AI

Microsoft Just Made AI Coding Agents 10x More Efficient

FastContext: Training Efficient Repository Explorer for

How to Test AI Agents: Simulating Real-World Scenarios

You finish the build, run the

Guide to Agentic AI – Build a Python Coding Agent with Gemini

Build your own functional AI

This Coding Benchmark Finally Punishes Fake Agents

DeepSWE is a

The 100% EASIEST Way to Test LLMs & AI Agents (Seriously)

Learn how to professionally