ViewTube

32 results

3cycle
Intro to GPU programming with CUDA

A 'Math Club' talk, by 2swap!

1:09:17

2,341 views

4 days ago

Sonsie Face
Nvidia CUDA vs Apple Metal for AI Work

I take a look at the different architectures for creating and using AI, and Apple's Metal is surprisingly good in some scenarios.

5:22

436 views

4 days ago

Cihangir Tezcan
CUDA Optimization of NSA's Cipher: SPECK

Full Course: https://www.youtube.com/playlist?list=PLUoixF7agmIujuNg-OLK4GyoHYulgOCFi CUDA source codes and the paper: ...

8:21

22 views

2 days ago

Tinge Zhang
20260303 StitchCUDA Automated GPU Programming

StitchCUDA is an automated multi-agent framework designed to generate and optimize end-to-end GPU programs for complex ...

9:27

0 views

6 days ago

Priyam Mazumdar
Triton Grouped Matrix Multiplication (Almost CUDA Performance!) | A MyTorch Sidequest

Code: https://github.com/priyammaz/TritonKernels/tree/main We implement Grouped Matrix Multiplication that simply reorganizes ...

36:19

99 views

5 days ago

DIY Smart Code
Ubuntu 26.04 Just Killed GPU Driver Hell Forever

This video highlights Ubuntu's significant role in powering the majority of AI workloads in 2026, detailing its hardware ...

8:59

6,536 views

6 days ago

Engineering Study Desk
NPTEL GPU Architecture and Programming Week 8 Assignment 8 Answers 2026 (100 % Correct)

#gpu #gpuarchitecture #cuda #programming #parallelprocessing #assignmentsolutions #nptelsolutions #coding #Enggstudydesk ...

2:20

61 views

6 days ago

Cuda Education
GPU Programming | Compute Particles PART 3.1 | Shader link for attach to cursor feature | Vulkan API

In this episode, I investigate the link between the shader on the GPU side and the mouse position data on the CPU/Operating ...

18:21

0 views

12 hours ago

Vinh Nguyen
Tile the Tensors

... learning curve for these abstractions is steep, they offer significant benefits for writing portable, high-performance CUDA code.

6:43

6 views

4 days ago

EuroCC FRANCE - CC-FR and LAPP CNRS
GPU Programming with PyTorch: Beyond Deep Learning

Can PyTorch do more than deep learning? Absolutely — including simulating physical systems like the Gray-Scott reaction. In this ...

1:08:03

25 views

6 days ago

Tinge Zhang
20260227 CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

To address the extreme technical difficulty of CUDA programming, the researchers developed a scalable data synthesis pipeline ...

8:42

0 views

6 days ago

Lukasz Gawenda
Qwen 3.5 Vision – The ONLY LOCAL Setup YOU NEED (No Ollama/LM Studio)! It's INSANE!

Qwen3.5 Vision AI locally — full image & video support using llama.cpp + vLLM in Docker. No Ollama, no LM Studio. The ONLY ...

13:16

1,096 views

6 days ago

SciPulse
CUDA Agent: Large-Scale Agentic RL for High-Performance GPU Kernel Generation

*Key Insights & Contributions:* • *The Three-Stage Data Pipeline* — To overcome the scarcity of expert CUDA code, the ...

7:11

69 views

1 day ago

Ray Fernando
OpenClaw's Creator Says Use This Plugin

The founder of OpenClaw just recommended a third-party plugin over his own built-in memory system. Pete Steinberger tweeted ...

1:47:03

4,419 views

Streamed 1 day ago

Afrokit Media
Rope Space Worm Face Swap Installation

Rope Space Worm Face Swap Installation. This video guide will teach you how to install the Rope Space Worm Version.

16:40

142 views

4 days ago

The Oracle Guy: AI Unlocked
I Made Qwen3 TTS 10x Faster (Run It Locally)

Many of you asked for a macOS option to run top TTS models locally. So I built OpenVox — a local AI voice studio for Mac that ...

24:30

3,002 views

6 days ago

Eran Feit
Make YOLOv8 10x Faster with Nvidia TensorRT

Stop settling for slow inference speeds. If you want to Make YOLOv8 10x Faster with Nvidia TensorRT, this is the only tutorial you ...

17:29

25 views

18 hours ago

NeoSamurai
Say goodbye to ChatGPT: Create your own Local AI in Proxmox (STEP BY STEP)

Have you run out of cloud storage credits or are you worried about your data privacy? Today we're going to turn Kaito into a ...

53:45

672 views

3 days ago

Solo Swift Crafter
Nvidia Has a Problem. It's Called Apple.

Come be part of the crew: https://www.crafterslab.dev Your MacBook might already be outperforming a $2000 Nvidia GPU for the ...

12:27

4,907 views

5 days ago

Lossfunk
Building AI Systems That Write Their Own GPU Kernels

How can LLMs help improve the systems that run them? In this talk, Manoj explores how AI can help discover faster GPU kernels ...

1:02:29

55 views

15 hours ago