Media Summary: From Fully Connected 2023. Join Stella Biderman, Executive Director of EleutherAI and Head of ... What's happening inside an AI model as it thinks? Why are AI models sycophantic, and why do they hallucinate? Are AI models ... How can we reverse engineer what a neural network is doing? In this IASEAI '25 session, An Introduction to Mechanistic ...

How Interpretability Research Helps Build Better Models - Detailed Analysis & Overview

How Interpretability Research Helps Build Better Models

From Fully Connected 2023. Join Stella Biderman, Executive Director of EleutherAI and Head of ...

How interpretability paves the way for building an explainable AI system

Check out Ajay Thampi's book

Interpretability: Understanding how AI models think

What's happening inside an AI model as it thinks? Why are AI models sycophantic, and why do they hallucinate? Are AI models ...

An Introduction to Mechanistic Interpretability – Neel Nanda | IASEAI 2025

How can we reverse engineer what a neural network is doing? In this IASEAI '25 session, An Introduction to Mechanistic ...

What is interpretability?

A surprising fact about modern large language models is that nobody really knows how they work internally. At Anthropic, the ...

Assessing skeptical views of interpretability research

Stanford AI Lab Faculty Lunch, November 7, 2025. Updated version of https://web.stanford.edu/~cgpotts/blog/interp/ ...

What Matters Right Now In Mechanistic Interpretability?

This is a talk I gave to my MATS 9.0 training scholars about the big picture of mech interp - as of Oct 2025, what had changed?

Interpretability - now what?

Been Kim (Google Brain) https://simons.berkeley.edu/talks/tbd-72 Frontiers of Deep Learning.

The Dark Matter of AI [Mechanistic Interpretability]

25. Interpretability

MIT 6.S897 Machine Learning for Healthcare, Spring 2019 Instructor: Peter Szolovits View the complete course: ...

Neel Nanda - Our Pivot To Pragmatic Interpretability [Alignment Workshop]

When Anthropic tested Claude Sonnet 4.5 for alignment, the model appeared perfectly behaved — but it turned out the model had ...

A Roadmap for the Rigorous Science of Interpretability | Finale Doshi-Velez | Talks at Google

With a growing interest in

Introduction to Mechanistic Interpretability with David Bau

CS 7180: Neural Mechanics Spring 2026 Course at Northeastern University Modern AI systems are powerful but opaque: even ...