4 Runtime Performance Optimizations - Detailed Analysis & Overview
In this video our very own Minko Gechev will take us through ...

Run massive AI models on your laptop! Learn the secrets of LLM quantization and how q2, q4, and q8 settings in Ollama can save ...

Get the guide to GAI, learn more → Learn more about the technology → Join Cedric ...

Here's the one change that took mine from ~120 tok/s to 1200+ without a new GPU.

TryHackMe just launched Cyber Security 101 ...

Uhhh I couldn't think of a description thing
0:00 - Intro
0:25 - Tip 1
1:13 - Tip 2
1:55 - Tip 3
2:12 - Tip
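
The quantization snippet above mentions q2, q4, and q8 settings without saying what they save. As a rough sketch of the arithmetic behind that claim, the following estimates weight memory from parameter count and bits per weight. The 7-billion-parameter model size and the function name are assumptions for illustration, not taken from the video, and the bit widths are nominal: real quantized formats also store per-block scale data, so actual files are somewhat larger, and this ignores the KV cache and activations.

```python
def approx_weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Hypothetical model size (assumption): 7 billion parameters.
PARAMS = 7e9

# Nominal bit widths only; real formats add some overhead per block of weights.
NOMINAL_BITS = {"fp16": 16, "q8": 8, "q4": 4, "q2": 2}

estimates = {
    name: approx_weight_memory_gib(PARAMS, bits)
    for name, bits in NOMINAL_BITS.items()
}

for name, gib in estimates.items():
    print(f"{name}: ~{gib:.1f} GiB")
```

Under these assumptions, halving the bits per weight halves the weight memory, which is why a lower setting can fit a model into laptop RAM, at some cost in output quality.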