Welcome to CUDA Programming Day 5! Today, we step into the world of deep learning and explore how CUDA powers one of the ...
96 views · 3 weeks ago
Layer Normalization was introduced to remove this dependency on batch-level statistics. Instead of normalizing across the batch, ...
0 views · 9 days ago
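A minimal sketch of the mechanism this snippet describes, in NumPy; the function name `layer_norm` and the (batch, features) layout are illustrative assumptions, not taken from the video:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each sample over its feature axis, independently of
    the other samples in the batch."""
    mean = x.mean(axis=-1, keepdims=True)    # per-sample mean
    var = x.var(axis=-1, keepdims=True)      # per-sample variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # standardized features
    return gamma * x_hat + beta              # learned scale and shift

x = np.random.randn(2, 8)                    # (batch, features)
out = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))
print(out.mean(axis=-1), out.var(axis=-1))   # ~0 and ~1 for each sample
```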
Chinese-language guide. Credits to Andrej Karpathy. References: https://www.youtube.com/watch?v=VMj-3S1tku0 ...
4 views · 4 weeks ago
Batch Normalization and Layer Normalization are techniques used in deep learning to stabilize and accelerate training by ...
10 views
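Since the snippet cuts off mid-sentence, here is the standard contrast it is presumably drawing, written out (the notation is ours, not the video's): both methods standardize activations, but they differ only in the axis the statistics are computed over.

```latex
% BatchNorm: one mean/variance per feature k, over the m samples in the batch
\hat{x}_{i,k} = \frac{x_{i,k} - \mu_k}{\sqrt{\sigma_k^2 + \epsilon}},
\qquad \mu_k = \tfrac{1}{m}\textstyle\sum_{i=1}^{m} x_{i,k}
% LayerNorm: one mean/variance per sample i, over its d features
\hat{x}_{i,k} = \frac{x_{i,k} - \mu_i}{\sqrt{\sigma_i^2 + \epsilon}},
\qquad \mu_i = \tfrac{1}{d}\textstyle\sum_{k=1}^{d} x_{i,k}
```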
... architectural tricks that made it work (residual connections, layer normalization, and a shift from regression to classification), ...
10,340 views · 5 days ago
Delve into the core components of the Transformer Block: Skip Connections (Add) and Layer Normalization (Norm): Essential ...
35 views · 4 days ago
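A sketch of the Add & Norm pattern this snippet names, in NumPy; `sublayer` stands in for attention or the feed-forward network, and the Post-LN ordering shown is the original Transformer's, an assumption about what the video covers:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Unit gain / zero bias for brevity; real blocks learn gamma and beta.
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def add_and_norm(x, sublayer):
    """Post-LN Transformer sublayer: residual add, then normalize."""
    return layer_norm(x + sublayer(x))

x = np.random.randn(4, 16)              # (tokens, d_model)
W = np.random.randn(16, 16) / 4         # toy feed-forward weights
ffn = lambda h: np.maximum(h @ W, 0.0)  # stand-in for the real sublayer
print(add_and_norm(x, ffn).shape)       # (4, 16)
```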
... encoding sentencepiece tokenizer embedding layer positional embeddings rotary positional embeddings layer normalization ...
496 views
Link: https://arxiv.org/abs/1706.03762 · Title: Layer Normalization. Link: https://arxiv.org/abs/1607.06450 · Title: Dropout: A Simple ...
169 views · 2 weeks ago
The LayerNorm Solution: Why Layer Normalization is superior for Transformers because it operates "horizontally" across features, ...
41 views
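A tiny demonstration of the "horizontal" claim, assuming a (batch, features) layout; because LayerNorm's statistics come from each row alone, a sample's output cannot change when its batch neighbors do:

```python
import numpy as np

x = np.random.randn(3, 5)     # rows = samples, columns = features
print(x.mean(axis=0))         # "vertical": per-feature stats, BatchNorm-style
print(x.mean(axis=1))         # "horizontal": per-sample stats, LayerNorm-style

# Replacing the other rows leaves a sample's horizontal statistics unchanged,
# which is why LayerNorm does not depend on batch composition.
y = np.vstack([x[:1], np.random.randn(2, 5)])
assert np.isclose(x[0].mean(), y[0].mean())
```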
Root Mean Square Layer Normalization. Link to Paper • RoPE: Su et al., 2021. RoFormer: Enhanced Transformer with Rotary ...
7 views
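The paper this snippet names (Zhang & Sennrich, 2019) drops the mean-centering and bias of LayerNorm, rescaling by the root mean square alone; a minimal NumPy sketch, with the function name an assumption:

```python
import numpy as np

def rms_norm(x, gamma, eps=1e-8):
    """RMSNorm: rescale by the root mean square of the features;
    no mean subtraction and no bias term."""
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return gamma * (x / rms)

x = np.random.randn(2, 8)
print(rms_norm(x, np.ones(8)).shape)   # (2, 8)
```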
Residual Connections and Layer Normalization: why deep Transformers are stable and trainable. Rather than treating the ...
236 views
Units 3, 4, 5.
13 views
In this video, we summarize “Deep Learning” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville — the definitive textbook ...
9 views
In this episode, we explore the three engineering pillars that made modern deep learning possible: advanced optimization ...
5 views
Batch Normalization is a key technique that makes deep neural networks faster, more stable, and easier to train. In this video, we ...
22 views · 6 days ago
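A sketch of the training-time mechanics this video's topic implies: per-feature statistics over the batch axis, plus running averages kept for inference (the names and the momentum value are illustrative assumptions):

```python
import numpy as np

def batch_norm_train(x, gamma, beta, running, momentum=0.9, eps=1e-5):
    """Normalize each feature over the batch axis and update the
    running statistics used later at inference time."""
    mu = x.mean(axis=0)                      # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    running["mean"] = momentum * running["mean"] + (1 - momentum) * mu
    running["var"] = momentum * running["var"] + (1 - momentum) * var
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

running = {"mean": np.zeros(8), "var": np.ones(8)}
x = np.random.randn(32, 8)                   # (batch, features)
out = batch_norm_train(x, np.ones(8), np.zeros(8), running)
print(out.mean(axis=0).round(6))             # ~0 for every feature
```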
Batch Normalization is a technique used in deep learning to stabilize and accelerate neural network training by normalizing layer ...
6 hours ago
We identify the issue as stemming from the prevalent use of Pre-Layer Normalization (Pre-LN) and introduce LayerNorm Scaling ...
15 views
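For context on the Pre-LN arrangement this abstract refers to: Pre-LN normalizes the sublayer input and leaves the residual path as an identity, whereas Post-LN normalizes after the add. A NumPy sketch of the two orderings under toy definitions (this shows only the arrangement, not the paper's LayerNorm Scaling fix itself):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def post_ln(x, sublayer):
    return layer_norm(x + sublayer(x))  # normalize after the residual add

def pre_ln(x, sublayer):
    return x + sublayer(layer_norm(x))  # normalize the input; identity residual

x = np.random.randn(4, 16)
W = np.random.randn(16, 16) / 4
ffn = lambda h: np.maximum(h @ W, 0.0)   # toy stand-in for the sublayer
print(post_ln(x, ffn).shape, pre_ln(x, ffn).shape)
```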
B-Trans (Population Bayesian Transformers) converts a standard large language model (LLM) into a Bayesian model so that, from a single set of weights, diverse ...
3 days ago
Batch normalization is a crucial technique used in deep learning models to improve their performance and stability. When training ...
18 views