How To Fail Interpretability Research - Detailed Analysis & Overview

How to Fail Interpretability Research

Been Kim (Google Brain) https://simons.berkeley.edu/talks/tba-90 Emerging Challenges in Deep Learning.

What is interpretability?

A surprising fact about modern large language models is that nobody really knows how they work internally. At Anthropic, the ...

Interpretability - now what?

Been Kim (Google Brain) https://simons.berkeley.edu/talks/tbd-72 Frontiers of Deep Learning.

Interpretability: Understanding how AI models think

Read more about Anthropic's

How Reasoning Models Break Mechanistic Interpretability Techniques

A talk I gave to my MATS 9.0 training program about reasoning model

The Dark Matter of AI [Mechanistic Interpretability]

Neel Nanda - Our Pivot To Pragmatic Interpretability [Alignment Workshop]

When Anthropic tested Claude Sonnet 4.5 for alignment, the model appeared perfectly behaved — but it turned out the model had ...

MLHC 2022 - Been Kim: How to stop worrying about interpretability, and start making progress

MLHC 2022 - Been Kim: Don't do it Emmanuel! How to stop worrying about

25. Interpretability

MIT 6.S897 Machine Learning for Healthcare, Spring 2019 Instructor: Peter Szolovits View the complete course: ...

An Introduction to Mechanistic Interpretability – Neel Nanda | IASEAI 2025

How can we reverse engineer what a neural network is doing? In this IASEAI '25 session, An Introduction to Mechanistic ...

Reading AI's Mind - Mechanistic Interpretability Explained [Anthropic Research]

Mechanistic Interpretability

Diving deep into the fascinating world of mechanistic

Between the Layers – Interpreting Large Language Models - Michelle Frost - NDC AI 2025

This talk was recorded at NDC AI in Oslo, Norway. Attend the next NDC ...