ViewTube

3,685 results

Airtrain AI
What is LLM quantization?

In this video we define the basics of quantization and look at its benefits and how it affects large language models.

5:13

28,922 views

2 years ago

Julia Turc
How LLMs survive in low precision | Quantization Fundamentals

In this video, we discuss the fundamentals of model quantization, the technique that allows us to run inference on massive LLMs ...

20:34

46,460 views

9 months ago

Matt Williams
Optimize Your AI - Quantization Explained

Run massive AI models on your laptop! Learn the secrets of LLM quantization and how q2, q4, and q8 settings in Ollama can save ...

12:10

415,192 views

1 year ago

Adam Lucek
Quantizing LLMs - How & Why (8-Bit, 4-Bit, GGUF & More)

Quantizing models for maximum efficiency gains! Resources: Model Quantized: ...

26:26

22,959 views

1 year ago

New Machina
What is LLM Quantization ?

Large Language Models (LLMs) are built using ...

9:57

3,100 views

11 months ago

BlueSpork
DeepSeek R1: Distilled & Quantized Models Explained

This video explores DeepSeek R1, how distilled versions and quantization make it more accessible, and the trade-offs between ...

3:47

23,456 views

1 year ago

Zachary Huang
Give me 30 min, I will make Quantization click forever

Text:* https://github.com/The-Pocket/PocketFlow-Tutorial-Video-Generator/blob/main/docs/llm/quantization.md 0:00:00 ...

32:42

2,543 views

3 months ago

Matt Williams
5. Comparing Quantizations of the Same Model - Ollama Course

Welcome back to the Ollama course! In this lesson, we dive into the fascinating world of AI model quantization. Using variations of ...

10:29

29,634 views

1 year ago

Gary Explains
Does LLM Size Matter? How Many Billions of Parameters do you REALLY Need?

Large Language Models (LLMs) are measured by the number of parameters they contain – the number of weights and biases ...

25:03

44,477 views

1 year ago

Julia Turc
The myth of 1-bit LLMs | Quantization-Aware Training

Are 1-bit LLMs the future of efficient AI? Or just a catchy Microsoft metaphor? In this video, we break down BitNet, the so-called ...

24:37

88,670 views

9 months ago

Julia Turc
Reverse-engineering GGUF | Post-Training Quantization

The first comprehensive explainer for the GGUF quantization ecosystem. GGUF quantization is currently the most popular tool for ...

25:07

50,160 views

8 months ago

Krish Naik
Part 1-Road To Learn Finetuning LLM With Custom Data-Quantization,LoRA,QLoRA Indepth Intuition

Quantization is a common technique used to reduce the model size, though it can sometimes result in reduced accuracy.

32:55

162,610 views

2 years ago

Efficient NLP
Quantization vs Pruning vs Distillation: Optimizing NNs for Inference

Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io Four techniques to optimize the speed ...

19:46

61,338 views

2 years ago

Discover AI
LLM Quantization (Ollama, LM Studio): Any Performance Drop? TEST

A NEW benchmark and guide which quantization models to use locally on your PC or laptop. Either in Ollama or in LM Studio, ...

19:01

3,988 views

6 months ago

Umar Jamil
Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training

In this video I will introduce and explain quantization: we will first start with a little introduction on numerical representation of ...

50:55

51,617 views

2 years ago

AppliedAI
Understanding Model Quantization and Distillation in LLMs

Learn how model quantization and distillation—two key techniques for large model compression—help reduce costs and improve ...

4:54

949 views

1 year ago

bycloud
1-Bit LLM: The Most Efficient LLM Possible?

Download Tanka today https://www.tanka.ai and enjoy 3 months of free Premium! You can also get $20 / team for each referrals ...

14:35

367,904 views

8 months ago

Codeically
I Made The Smallest (And Dumbest) LLM

I Made ChatGPT-2 Run on a Potato (63MB AI Model!) - Extreme Quantization Experiment What happens when you compress a ...

5:52

486,185 views

6 months ago

DeepFindr
LoRA explained (and a bit about precision and quantization)

Papers / Resources ▭▭▭ LoRA Paper: https://arxiv.org/abs/2106.09685 QLoRA Paper: https://arxiv.org/abs/2305.14314 ...

17:07

121,540 views

2 years ago