quantized llm

In this video we define the basics of quantization and look at its benefits and how it affects large language models.

5:13

What is LLM quantization?

28,922 views

2 years ago

Julia Turc

How LLMs survive in low precision | Quantization Fundamentals

In this video, we discuss the fundamentals of model quantization, the technique that allows us to run inference on massive LLMs ...

20:34

46,460 views

9 months ago

Matt Williams

Optimize Your AI - Quantization Explained

Run massive AI models on your laptop! Learn the secrets of LLM quantization and how q2, q4, and q8 settings in Ollama can save ...

12:10

415,192 views

1 year ago

Adam Lucek

Quantizing LLMs - How & Why (8-Bit, 4-Bit, GGUF & More)

Quantizing models for maximum efficiency gains! Resources: Model Quantized: ...

26:26

22,959 views

1 year ago

New Machina

Large Language Models (LLMs) are built using ...

9:57

What is LLM Quantization ?

3,100 views

11 months ago

BlueSpork

DeepSeek R1: Distilled & Quantized Models Explained

This video explores DeepSeek R1, how distilled versions and quantization make it more accessible, and the trade-offs between ...

3:47

23,456 views

1 year ago

Zachary Huang

Give me 30 min, I will make Quantization click forever

Text: https://github.com/The-Pocket/PocketFlow-Tutorial-Video-Generator/blob/main/docs/llm/quantization.md 0:00:00 ...

32:42

2,543 views

3 months ago

Matt Williams

5. Comparing Quantizations of the Same Model - Ollama Course

Welcome back to the Ollama course! In this lesson, we dive into the fascinating world of AI model quantization. Using variations of ...

10:29

29,634 views

1 year ago

Gary Explains

Does LLM Size Matter? How Many Billions of Parameters do you REALLY Need?

Large Language Models (LLMs) are measured by the number of parameters they contain – the number of weights and biases ...

25:03

44,477 views

1 year ago

Julia Turc

The myth of 1-bit LLMs | Quantization-Aware Training

Are 1-bit LLMs the future of efficient AI? Or just a catchy Microsoft metaphor? In this video, we break down BitNet, the so-called ...

24:37

88,670 views

9 months ago

Julia Turc

Reverse-engineering GGUF | Post-Training Quantization

The first comprehensive explainer for the GGUF quantization ecosystem. GGUF quantization is currently the most popular tool for ...

25:07

50,160 views

8 months ago

Krish Naik

Part 1-Road To Learn Finetuning LLM With Custom Data-Quantization,LoRA,QLoRA Indepth Intuition

Quantization is a common technique used to reduce the model size, though it can sometimes result in reduced accuracy.

32:55

162,610 views

2 years ago

Efficient NLP

Quantization vs Pruning vs Distillation: Optimizing NNs for Inference

Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io Four techniques to optimize the speed ...

19:46

61,338 views

2 years ago

Discover AI

LLM Quantization (Ollama, LM Studio): Any Performance Drop? TEST

A NEW benchmark and guide which quantization models to use locally on your PC or laptop. Either in Ollama or in LM Studio, ...

19:01

3,988 views

6 months ago

Umar Jamil

Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training

In this video I will introduce and explain quantization: we will first start with a little introduction on numerical representation of ...

50:55

51,617 views

2 years ago

AppliedAI

Understanding Model Quantization and Distillation in LLMs

Learn how model quantization and distillation—two key techniques for large model compression—help reduce costs and improve ...

4:54

949 views

1 year ago

bycloud

1-Bit LLM: The Most Efficient LLM Possible?

Download Tanka today https://www.tanka.ai and enjoy 3 months of free Premium! You can also get $20 / team for each referrals ...

14:35

367,904 views

8 months ago

Codeically

I Made ChatGPT-2 Run on a Potato (63MB AI Model!) - Extreme Quantization Experiment What happens when you compress a ...

5:52

I Made The Smallest (And Dumbest) LLM

486,185 views

6 months ago

DeepFindr

LoRA explained (and a bit about precision and quantization)

Papers / Resources ▭▭▭ LoRA Paper: https://arxiv.org/abs/2106.09685 QLoRA Paper: https://arxiv.org/abs/2305.14314 ...

17:07

121,540 views

2 years ago
