ML Performance Reading Group Session 16: LMCache

Paper: LMCache (https://arxiv.org/pdf/2510.09665) Presenter: A. Mahmood Slides: ...

ML Performance Reading Group Session 24: Flash Attention 4

ML Performance Reading Group Session 19: Speculative Decoding

ML Performance Reading Group Session 18: Kimi Delta Attention

Presenter: Daniel Vega-Myhre, with part by wave_function Paper: https://arxiv.org/pdf/2510.26692.

ML Performance Reading Group Session 2: Flash Attention

ML Performance Reading Group Session 25: Prefill as a Service

Paper: https://www.alphaxiv.org/abs/2604.15039v1 Slides: ...

ML Performance Reading Group Session 15: Megablocks

Paper: Megablocks (https://arxiv.org/pdf/2211.15841) Presenter: rdyro.

ML Performance Reading Group Session 5: Paged Attention

ML Performance Reading Group Session 1: GPU Architecture, CUDA, NCCL

ML Performance Reading Group Session 17: MXFP8 Training for MoEs with TorchAO

Presenter: Daniel Vega-Myhre Code: https://github.com/pytorch/ao/tree/main/torchao/prototype/moe_training.

ML Performance Reading Group Session 20: Native Sparse Attention

Paper: https://arxiv.org/abs/2502.11089 Presenter: arshadm@

ML Performance Reading Group Session 8: Megatron-LM

ML Performance Reading Group Session 11: Async Tensor Parallelism
