1,884 results
This video evaluates the top AI code review bots to see how well they handle modern full-stack application code, specifically ...
4,175 views
4 days ago
Andrej Karpathy's Autoresearch demonstrates autonomous agent loops: agents edit training code, run fixed five-minute ...
35,113 views
https://arxiv.org/pdf/2603.03823 SWE-CI: Evaluating Long-Term Code Maintainability via Continuous Integration Agents The ...
70 views
6 days ago
Get ALL of our systems & join hundreds of AI builders in our community ...
77,801 views
7 days ago
In this AI Research Roundup episode, Alex discusses the paper: 'SWE-CI: Evaluating Agent Capabilities in Maintaining ...
49 views
Visit Mixture of Experts podcast page to get more AI content → https://ibm.biz/BdpqsM Can your AI agent hack its own evaluation?
2,500 views
1 day ago
Anthropic has launched "Claude Code Review", a sophisticated tool designed to automate the evaluation of GitHub pull requests ...
28 views
I'll show you exactly where AI-generated code fell short, and what human code review caught that the tests never would. What we ...
538 views
2 days ago
Many retrieval-augmented generation (RAG) and code-search pipelines rely on ad-hoc checks and break when deployed at scale ...
3 views
6 hours ago
AI is not a junior engineer. It's the most senior engineer without context. At The Future of Frontend in Dublin, Kesha Mykhailov from ...
218 views
The approach replaces sporadic maintenance with ongoing AI-driven code evaluation that steadily refines performance and ...
177 views
Stop relying on a single metric to judge your AI. Most AI teams face a massive "evaluation blind spot." Your model might score ...
4 views
5 days ago
Your LLM passed the demo. It failed production. Here's how to fix that. Most teams ship RAG pipelines with zero evaluation — no ...
134 views
... 0:21 — Real-time scoring: 5/10 with breakdown 0:30 — One-click evaluation report 0:38 — Screenshot code review 0:43 — The ...
7 views
https://www.anthropic.com/engineering/eval-awareness-browsecomp Eval Awareness in Claude Opus 4.6 BrowseComp ...
2 views
Looking for a working Tradeify discount code? You can use code INVEST to receive 33% OFF your Tradeify futures trading ...
9 views
Want to plan, build, evaluate, customize, and deploy your agentic AI solutions right from your IDE? The AI Toolkit accelerates ...
662 views
Streamed 5 days ago
In this episode, Sid Pardeshi, co-founder and CTO of Blitzy, joins us to discuss building autonomous development systems able to ...
766 views
3 days ago
Autoresearch AI Experiment Framework https://github.com/karpathy/autoresearch AutoResearch at Home Distributed Agent ...
32 views
1 view