SPCL_Bcast Cloud-Scale Inference

[SPCL_Bcast] Cloud-Scale Inference on FPGAs at Microsoft Bing

Speaker: Steve Reinhardt Venue: SPCL_Bcast, recorded on 8 April, 2021 Abstract: Microsoft's Project Catapult began nearly a ...

Infer() Summit 2026: Inference at Scale

Inference

Distributed Computing @ Scale for AI Training & Inference

Presenter(s): Hasan Siraj, Head of Software Products, Broadcom. As AI models continue to grow in complexity, both training and ...

[SPCL_Bcast] Evaluating Large-Scale Learning Systems

Speaker: Virginia Smith Venue: SPCL_Bcast #41, recorded on 13th October, 2023 Abstract: To deploy machine learning models ...

Improving LLM Throughput via Data Center-Scale Inference Optimizations

Speaker: Maksim Khadkevich, Sr. Software Engineering Manager, Dynamo, NVIDIA. Khadkevich discusses data center ...

[SPCL_Bcast] Follow the Data: Memory-Centric Designs for Modern Datacenters

Speaker: Dan Ernst Venue: SPCL_Bcast, recorded on 16th February, 2023 Abstract: Memory has passed compute as the most ...

[SPCL_Bcast #48] Improving Cloud Security with Hardware Memory Capabilities

Speaker: Peter Pietzuch Venue: SPCL_Bcast #48, recorded on 18th April, 2024 Abstract: More and more data-intensive ...

Inference Is the Bottleneck Now: How to Architect LLM Serving in 2026 (vLLM, GPUs, Decentralized)

Hey everyone, in this video, I showcase how LLM ...

The Engineering Behind LLM Inference: Kernels and Memory

Two GPU kernels can compute the exact same attention, on the same chip, with identical inputs and identical outputs, and one still ...

[SPCL_Bcast #46] Capturing Computation with Algorithmic Alignment

Speaker: Petar Veličković Venue: SPCL_Bcast #46, recorded on 21st March, 2024 Abstract: What makes a neural network better, ...

Inference Deployments and Comms Implication - Live from SCC

Speakers: Cen Zhao, Xiaodong Wang, and Jianyu Huang Learn more here: ...

Evolving KServe: The Unified Model Inference Platform for Both Predictive and... F. Spolti & J. Lee

Don't miss out! Join us at our next KubeCon + CloudNativeCon events in Mumbai, India (18-19 June, 2026), Yokohama, Japan ...

[SPCL_Bcast] HPC and AI/ML: A Synergistic Relationship

Speaker: Abhinav Bhatele Venue: SPCL_Bcast, recorded on 2nd March, 2023 Abstract: The rapid increase in memory capacity ...