Run 100B Parameter LLMs On - Detailed Analysis & Overview

Running 100B Parameter LLM Model on Snapdragon X Elite, Microsoft Surface Laptop 7

I work for Qualcomm/Snapdragon, and I'm very into AI. So I wanted to test the limits of how big an AI model you can

Run 100B+ Parameter LLMs on a Single GPU: Quantization Explained!

Focuses on the "napkin math" and ROI. Stop wasting money on inference. Most AI spend happens in production, not training.

Your local LLM is 10x slower than it should be

Here's the one change that took mine from ~120 tok/s to 1200+ without a new GPU. TryHackMe just launched Cyber Security 101 ...

Does LLM Size Matter? How Many Billions of Parameters do you REALLY Need?

Large Language Models (

Run Local LLMs on Hardware from $50 to $50,000 - We Test and Compare!

Dave tests llama3.1 and llama3.2 using Ollama on a Raspberry Pi, a Herk Orion Mini PC, a 3970X, an M2 Mac Pro, and a ...

Optimize Your AI - Quantization Explained

Run

Microsoft open sources BitNet: 100B parameter LLM running on a single CPU via 1.58-bit ternary weigh

Deep dive: Microsoft open sources BitNet:

I Ran a Trillion Parameter AI on a Mac... Here’s the Secret

I wired four Mac Studios together and loaded a 1 Trillion

1-Bit LLM: The Most Efficient LLM Possible?

Download Tanka today at https://www.tanka.ai and enjoy 3 months of free Premium! You can also get $20 / team for each referral ...

THIS is the REAL DEAL 🤯 for local LLMs

This is the stack that gets me over 4000 tokens per second locally. Download Docker Desktop here: https://dockr.ly/4mOdGMO to ...

Private AI on the go… a new trick

I put a tiny MacBook Air between me and some ridiculously large local AI models... and it worked. Power Your Spring Essentials ...

4 levels of LLMs (on the go)

I put four portable systems to the local

BitNet: Run 100B AI Models on Your CPU — No GPU Needed

Note: This video is generated by AI. I designed a workflow that uses OpenCode to investigate open-source repositories to write a ...