Large Language Models explained briefly
Timestamps:
0:00 – Who this was made for
0:41 – What are large language models?
7:48 – Where to learn more
Chapters:
0:00 – Introduction
1:54 – Neural N-Gram Models
6:03 – Recurrent Neural Networks
11:47 – LSTM Cells
12:22 – Outro
If you’re interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the combination of the value and output matrices as being a combined low-rank map from…
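For readers who want the linear-algebra version of that observation, here is a minimal sketch. The symbols and shapes ($W_V$, $W_O$, $d_\text{model}$, $d_\text{head}$) are standard Transformer conventions assumed here, not taken from the posts themselves:

```latex
\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Assumed notation (standard Transformer conventions):
% d_model = embedding dimension, d_head = per-head dimension, d_head << d_model.
For one attention head, the value and output matrices have shapes
\[
  W_V \in \mathbb{R}^{d_{\text{head}} \times d_{\text{model}}}, \qquad
  W_O \in \mathbb{R}^{d_{\text{model}} \times d_{\text{head}}},
\]
so their composition acts on the embedding space directly, and
\[
  W_O W_V \in \mathbb{R}^{d_{\text{model}} \times d_{\text{model}}}, \qquad
  \operatorname{rank}(W_O W_V) \le d_{\text{head}} \ll d_{\text{model}}.
\]
\end{document}
```

Under that reading, the two matrices act together as a single linear map of rank at most $d_\text{head}$, which is the low-rank framing the sentence above alludes to.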
This is the last of a series of 3 videos where we demystify Transformer models and explain them with visuals and friendly examples.

00:00 – Introduction
01:50 – What is a transformer?
04:35 – Generating one word at a time
08:59 – Sentiment Analysis
13:05 – Neural Networks
18:18 – Tokenization
19:12 – Embeddings
25:06 – Positional encoding
27:54 – Attention
32:29 – Softmax
35:48 – Architecture of a Transformer
39:00 – Fine-tuning
42:20 – Conclusion
https://www.youtube.com/watch?v=9-Jl0dxWQs8

AI Alignment Forum post from the DeepMind researchers referenced at the video's start:
https://www.alignmentforum.org/posts/…

Anthropic posts about superposition referenced near the end:
https://transformer-circuits.pub/2022…
https://transformer-circuits.pub/2023…

Some additional resources for those interested in learning more about mechanistic interpretability, offered by Neel Nanda:

Mechanistic interpretability paper reading list:
https://www.alignmentforum.org/posts/…

Getting started in mechanistic interpretability:
https://www.neelnanda.io/mechanistic-…

An interactive demo of sparse autoencoders (made…
Abstract: In this talk I'll highlight several exciting trends in the field of AI and machine learning. Through a combination of improved algorithms and major efficiency improvements in ML-specialized hardware, we are now able to build much more capable, general-purpose machine learning systems than ever before. As one example of this, I'll give an…