Efficient Distributed Orthonormal Optimizers for Large-Scale Training

Efficient Distributed Orthonormal Optimizers for Large-Scale Training

Speaker: Kwangjun Ahn, Microsoft Research. I delivered a 50-minute technical talk on recent advances in

Dion: The distributed orthonormal update revolution is here

Kwangjun Ahn, Senior Researcher at Microsoft Research AI Frontiers, introduces Dion, a next-generation

Dion: Distributed Orthonormalized Updates

Who's Adam and What's He Optimizing? | Deep Dive into Optimizers for Machine Learning!

Welcome to our deep dive into the world of

Making orthonormal updates more scalable - Kwangjun Ahn ｜ ASAP 44

Paper: https://arxiv.org/abs/2504.05295
Speaker: https://www.microsoft.com/en-us/research/people/kwangjunahn/

The Muon Optimizer: How Newton-Schulz Enables 2x Faster LLM Training (AdamW Killer?)

Muon is fundamentally changing how we approach large-scale deep learning

Quantum and Distributed AI/ML processing for Wireless Network Optimization

Speaker: Dilip Krishnaswamy, Quantum Walks The 2nd IEEE SA Open RAN Summit, hosted by the Johns Hopkins University ...

Distributed Optimization via Alternating Direction Method of Multipliers

Problems in areas such as machine learning and dynamic

Statistical Preconditioning for Distributed Optimization | JRC Workshop 2021

Artificial Intelligence (AI), 20 May 2021. Speaker: Hadrien Hendrikx, INRIA (collaboration with Francis Bach, Laurent Massoulié, ...

Distributed AI Inference at Scale on NVIDIA Dynamo With Gcore and Orange Business

Join NVIDIA, Gcore, and Orange for a technical deep dive into deploying and scaling AI inference with NVIDIA Dynamo—an ...

NSDI '26 - SYMI: Efficient Mixture-of-Experts Training via Model and Optimizer State Decoupling

OSDI '21 - Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and

Autonomy Talks - Navid Azizan: Towards Reliable and Efficient AI-Enabled Autonomy

Autonomy Talks - 11/26/25. Speaker: Prof. Navid Azizan, MIT. Title: Towards Reliable and