Streaming Attention Approximation Via Discrepancy - Detailed Analysis & Overview

Streaming Attention Approximation via Discrepancy Theory
Discrepancy and Approximation Algorithms
Streaming Lower Bounds for Approximating MAX-CUT
Streaming approximation resistance of every ordering CSP
Rethinking Attention with Performers (Paper Explained)
Nick Harvey - Approximating Hit Rate Curves using Streaming Algorithms
A Framework for Adversarial Streaming via Differential Privacy and Difference Estimators
From Under-approximations to Over-approximations and Back
Adversarial Streaming, Differential Privacy, and Adaptive Data Analysis
arxiv Preprint - Efficient Streaming Language Models with Attention Sinks
Streaming (Synchronous), Recursion, and Incremental Computation
Efficient Streaming Language Models with Attention Sinks

Streaming Attention Approximation via Discrepancy Theory

A Google TechTalk, presented by Ekaterina Kochetkova, 2025-10-23 ABSTRACT: The memory requirements of LLM inference ...

Discrepancy and Approximation Algorithms

Sasho Nikolov, University of Toronto https://simons.berkeley.edu/talks/sasho-nikolov-09-11-17 Discrete Optimization

Streaming Lower Bounds for Approximating MAX-CUT

Michael Kapralov, IBM T.J. Watson Research Center Information Theory in Complexity Theory and Combinatorics ...

Streaming approximation resistance of every ordering CSP

Authors: Noah Singer, Madhu Sudan, and Santhoshini Velusamy.

Rethinking Attention with Performers (Paper Explained)

Nick Harvey - Approximating Hit Rate Curves using Streaming Algorithms

Nick Harvey presents as part of the UBC Department of Computer Science's Faculty Lecture Series, November 13, 2014. A hit rate ...

A Framework for Adversarial Streaming via Differential Privacy and Difference Estimators

Authors: Idan Attias (Ben-Gurion University); Edith Cohen (Google and Tel Aviv University); Moshe Shechner (Tel Aviv University); ...

From Under-approximations to Over-approximations and Back

Current approaches to software model checking can be divided into over-

Adversarial Streaming, Differential Privacy, and Adaptive Data Analysis

Streaming

arxiv Preprint - Efficient Streaming Language Models with Attention Sinks

Source: https://www.podbean.com/eau/pb-6b48f-14bed92 In this episode we discuss Efficient

Streaming (Synchronous), Recursion, and Incremental Computation

Val Tannen (University of Pennsylvania) ...

Efficient Streaming Language Models with Attention Sinks

This paper introduces StreamingLLM, an efficient framework that allows large language models to generalize to infinite sequence ...

"How NOT to Measure Latency" by Gil Tene

Time is money. Understanding application responsiveness and latency is critical, but good characterization of bad data is useless.