Media Summary: In this AI Research Roundup episode, Alex discusses the paper: ' In this talk, Ernst Haagsman, Product Leader at JetBrains, shares his expertise on scaling developer tools from his early days on ... John Yang is a PhD student at Stanford and the creator of the SWE-

ISO-Bench: Can Coding Agents Optimize Real-World Inference Workloads? - Detailed Analysis & Overview

ISO-Bench: Can Coding Agents Optimize Real-World Inference Workloads?

Paper:

ISO-Bench: Benchmarking LLM Optimization Agents

In this AI Research Roundup episode, Alex discusses the paper: '

Benchmarking AI Agents Against Realistic Analytical Tasks with ADE-bench

[2026 - DAY 2 -

Practical AI Coding Agent Evaluation with SWE-bench, TeamCity, and Juni | Ernst Haagsman

In this talk, Ernst Haagsman, Product Leader at JetBrains, shares his expertise on scaling developer tools from his early days on ...

How I Turned Pi Into the Ultimate Coding Agent

Pi has quickly become my favorite

The emerging skillset of wielding coding agents — Beyang Liu, Sourcegraph / Amp

It's raining

Creating Quality tasks for benchmarking AI Agents on Terminal Bench

Ever wondered how we actually test AI

This FREE AI Coding Agent Just Hit 70.6% on SWE-Bench (Runs Locally, Apache 2.0)

Alibaba just released Qwen3-

Beyond SWE-Bench Pro - Where do Agents go from Here?

Yanis He (SWE-

What Is Agentic Coding? How AI Agents Modernize Code

Learn more about Agentic

I Sandboxed My Coding Agents. You Should Too.

Coding agents

Benchtalks #2: From SWE-bench to ProgramBench: The Future of Coding Benchmarks with John Yang

John Yang is a PhD student at Stanford and the creator of the SWE-

The $3 Trillion AI Coding Opportunity

AI