Mech Interp Reading Group Tracing

Media Summary: Join us in this session as we dive into "Learning a Generative Meta-Model of LLM Activations" by Grace Luo, Jiahai Feng, Trevor ...

Mech Interp Reading Group - Tracing Attention Computation Through Feature Interactions

Join us in this session as we dive into "

Mech Interp Reading Group - Tracing the thoughts of a large language model

Join us in this session as we dive into "

Mech Interp Reading Group - Circuit Tracing: Revealing Computational Graphs in Language Models

Join us in this session as we dive into "Circuit

Mech Interp Reading Group - Learning a Generative Meta-Model of LLM Activations

Join us in this session as we dive into "Learning a Generative Meta-Model of LLM Activations" by Grace Luo, Jiahai Feng, Trevor ...

Mech Interp Reading Group - Open Problems in Mechanistic Interpretability

Join us in this session as we dive into "Open Problems in Mechanistic Interpretability" by Lee Sharkey et al.! Read the article ...

[State of MechInterp] SAEs in Production, Circuit Tracing, AI4Science, "Pragmatic" Interp — Goodfire

From PhD research on grounding and language models to shipping interpretability tools in production at Goodfire, Jack ...

Mech Interp Reading Group - Negation Neglect: When models fail to learn negations in training

Join us in this session as we dive into "Negation Neglect: When models fail to learn negations in training" by Harry Mayne, Lev ...

Mech Interp Reading Group - The Dead Salmons of AI Interpretability

Join us in this session as we dive into "The Dead Salmons of AI Interpretability" by Maxime Méloux, Giada Dirupo, François Portet, ...

EleutherAI Interpretability Reading Group 220604: Attention head bandwidth connectomes of LLMs.

ERRATA: Scaling DOES change the composition term. We were wrong about the form of scaling, and we're updating the results ...

Paper Reading Group - Learning and Embodiment for Robotic Manipulation

We are happy to welcome the next round of our AI paper

Mechanistic Interpretability, Part 1 | ML@P Reading Group | Jinen Setpal

Slides: https://cs.purdue.edu/homes/jsetpal/slides/mechinterp.pdf We covered most of transformer circuits, and will cover ...

The Story of Mech Interp

This is a talk I gave to my MATS scholars, with a stylised history of the field of mechanistic interpretability, as I see it (with a focus ...

Electron Flow Matching for Generative Reaction Mechanism Prediction | LeMaterial Reading Group

Paper Link: https://www.nature.com/articles/s41586-025-09426-9 Most recent ML models in reaction prediction often fail to ...