This is the second of a series of 3 videos where we demystify Transformer models and explain them with visuals and friendly examples.
00:00 Introduction
01:18 Recap: Embeddings and Context
04:46 Similarity
11:09 Attention
20:46 The Keys and Queries Matrices
25:02 The Values Matrix
28:41 Self and Multi-head attention
33:54: Conclusion