Mechanistic Interpretability Explained Understanding How

What is mechanistic interpretability? Neel Nanda explains.

Art by @hamishdoodles. Clipped from episode 19 of AXRP: https://youtu.be/3YbE7zybc5k?t=64 Transcript of that episode: ...

Lex Fridman Podcast full episode: https://www.youtube.com/watch?v=ugvHCXCOmm4

A surprising fact about modern large language models is that nobody really knows how they work internally. At Anthropic, the ...

This is a talk I gave to my MATS 9.0 training scholars about the big picture of mech interp - as of Oct 2025, what had changed?

How can we reverse engineer what a neural network is doing? In this IASEAI '25 session, An Introduction to

What's happening inside an AI model as it thinks? Why are AI models sycophantic, and why do they hallucinate? Are AI models ...

Solving AI Doomerism: ...

May 13, 2025. Large language models do many things, and it's not clear from black-box interactions how they do them. We will ...

This is a talk I gave to my MATS scholars, with a stylised history of the field of

Neural networks have become increasingly impressive in recent years, but there's a big catch: we don't really know what they are ...
