26:10
YouTube
3Blue1Brown
Attention in transformers, step-by-step | Deep Learning Chapter 6
Demystifying attention, the key mechanism inside transformers and LLMs. Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support Special thanks to these supporters: https://www.3blue1brown.com/lessons/attention#thanks An equally valuable form of support is to simply share the videos. Demystifying self ...
3.7M views
Apr 7, 2024
Transformer Acceleration with Dynamic Sparse Attention
0:27
We’ve developed the Sparse Transformer, a deep neural network which sets new records at predicting what comes next in a sequence — whether text, images, or sound. It uses an algorithmic improvement of the attention mechanism to extract patterns from sequences 30x longer than possible previously. Read more: https://openai.com/blog/sparse-transformer/ | OpenAI
Facebook
OpenAI
7.9K views
Apr 23, 2019
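The OpenAI blurb above describes restricting attention to a sparse subset of positions so that much longer sequences become tractable. A minimal NumPy sketch of one such pattern — a causal local band plus periodic "anchor" columns, loosely in the spirit of the post's strided/fixed patterns; the exact mask layout and the `stride` parameter here are illustrative, not OpenAI's actual kernel:

```python
import numpy as np

def strided_sparse_mask(n, stride):
    # Query i may attend to: (a) the `stride` most recent positions
    # (local causal band), and (b) every earlier position whose index
    # is a multiple of `stride` (periodic anchor columns).
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        lo = max(0, i - stride + 1)
        mask[i, lo:i + 1] = True          # local band (causal)
        mask[i, : i + 1 : stride] = True  # anchor columns
    return mask

def sparse_attention(Q, K, V, mask):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)  # forbid masked pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = sparse_attention(Q, K, V, strided_sparse_mask(n, stride=3))
print(out.shape)  # (8, 4)
```

Each row of the mask has O(stride + n/stride) nonzeros instead of O(n), which is where the savings on long sequences come from.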
1:02
This AI Breakthrough Changes Everything (Sparse Gated Attention) #Shorts
YouTube
CollapsedLatents
2 views
2 weeks ago
7:40
HySparse Hybrid Sparse Attention Architecture with Oracle Token Selection & KV Cache Sharing
YouTube
CosmoX
1 week ago
Top videos
How Attention works in Deep Learning: understanding the attention mechanism in sequence models | AI Summer
theaisummer.com
Nov 19, 2020
Why multi-head self attention works: math, intuitions and 10+1 hidden insights | AI Summer
theaisummer.com
Mar 25, 2021
20:14
Giannis Daras: Improving sparse transformer models for efficient self-attention (spaCy IRL 2019)
YouTube
Explosion
3.2K views
Jul 12, 2019
DeepSeek tests “sparse attention” to slash AI processing costs
arstechnica.com
5 months ago
Realistic Dynamic Clouds | Advanced Simulation | SideFX
sidefx.com
Jan 10, 2021
47:52
[DL Math+Efficiency] Rahim Entezari - Fast Video Generation
YouTube
Embedded AI Lab @TUG
1 month ago
40:54
Deep dive - Better Attention layers for Transformer models
15K views
Feb 12, 2024
YouTube
Julien Simon
10:56
Rasa Algorithm Whiteboard - Transformers & Attention 3: Multi…
59.7K views
May 4, 2020
YouTube
Rasa
11:55
Attention is all you need || Transformers Explained || Quick E…
23.4K views
Nov 27, 2021
YouTube
Developers Hutt
40:08
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sp…
6K views
Feb 21, 2025
YouTube
Gabriel Mongaras
15:25
Visual Guide to Transformer Neural Networks - (Episode 2) Multi-Head…
208.9K views
Dec 8, 2020
YouTube
Hedu AI by Batool Haider
36:05
[Transformer Survey] #2 Sparse Attention
4.1K views
Aug 4, 2021
YouTube
서울대학교 산업공학과 DSBA 연구실
27:07
Attention Approximates Sparse Distributed Memory
7.7K views
Oct 20, 2021
YouTube
MITCBMM
39:24
Intuition Behind Self-Attention Mechanism in Transformer Networ…
220.7K views
Oct 17, 2020
YouTube
Ark (ark)
26:36
Longformer: The Long-Document Transformer
26.1K views
Apr 20, 2020
YouTube
Yannic Kilcher
16:09
Self-Attention Using Scaled Dot-Product Approach
24.7K views
Mar 28, 2023
YouTube
Machine Learning Studio
9:57
A Dive Into Multihead Attention, Self-Attention and Cross-Attention
62K views
Apr 17, 2023
YouTube
Machine Learning Studio
1:11:53
Lecture 13: Attention
85.2K views
Aug 10, 2020
YouTube
Michigan Online
13:56
Attention is all you need explained
93.4K views
Jan 31, 2023
YouTube
Lucidate
3:29
What are Sparse Transformers?
1K views
Dec 24, 2023
YouTube
What Is It
5:49
Attention Mechanism | Deep Learning
37.7K views
Sep 28, 2020
YouTube
TwinEd Productions
27:07
Attention Is All You Need
762.4K views
Nov 28, 2017
YouTube
Yannic Kilcher
15:01
Illustrated Guide to Transformers Neural Network: A step by step ex…
1.2M views
Apr 28, 2020
YouTube
The AI Hacker
1:23:24
Self Attention in Transformers | Deep Learning | Simple Explanatio…
154.3K views
Feb 9, 2024
YouTube
CampusX
Selective attention test examples: videos plus insights
Aug 25, 2019
skillpacks.com
1:03:29
BigBird Research Ep. 1 - Sparse Attention Basics
3.6K views
Apr 12, 2021
YouTube
ChrisMcCormickAI
Sparse Attentive Memory Network for Click-through Rate Prediction…
Oct 17, 2022
acm.org
5:34
Attention mechanism: Overview
228.2K views
Jun 5, 2023
YouTube
Google Cloud Tech
BigBird Research Ep. 3 - Block Sparse Attention, ITC vs. ETC
975 views
Apr 22, 2021
YouTube
InnerWorkingsAI
17:48
The Neuroscience of “Attention”
29.7K views
Jun 14, 2022
YouTube
Hedu AI by Batool Haider
29:02
How Attention Got So Efficient [GQA/MLA/DSA]
66.2K views
3 months ago
YouTube
Jia-Bin Huang