https://www.youtube.com/watch?v=9-Jl0dxWQs8

AI Alignment Forum post from the DeepMind researchers referenced at the video’s start:
https://www.alignmentforum.org/posts/…

Anthropic posts about superposition referenced near the end:
https://transformer-circuits.pub/2022…
https://transformer-circuits.pub/2023…

Some additional resources, offered by Neel Nanda, for those interested in learning more about mechanistic interpretability

Mechanistic interpretability paper reading list
https://www.alignmentforum.org/posts/…

Getting started in mechanistic interpretability
https://www.neelnanda.io/mechanistic-…

An interactive demo of sparse autoencoders (made by Neuronpedia)
https://www.neuronpedia.org/gemma-sco…

Coding tutorials for mechanistic interpretability (made by ARENA)
https://arena3-chapter1-transformer-i…

Sections:
0:00 – Where facts in LLMs live
2:15 – Quick refresher on transformers
4:39 – Assumptions for our toy example
6:07 – Inside a multilayer perceptron
15:38 – Counting parameters
17:04 – Superposition
21:37 – Up next

If you’re interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the value and output matrices together as a single low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources did.
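That low-rank point can be seen directly in code. A minimal sketch with made-up dimensions (not taken from any particular model): if the value matrix maps the embedding space down to a smaller head dimension and the output matrix maps back up, their composition is one map from embedding space to itself whose rank can’t exceed the head dimension.

```python
import numpy as np

# Illustrative, hypothetical sizes: a 12-dim embedding and a 4-dim value/head space.
d_embed, d_value = 12, 4

rng = np.random.default_rng(0)
W_V = rng.standard_normal((d_value, d_embed))  # embedding -> value space
W_O = rng.standard_normal((d_embed, d_value))  # value space -> embedding

# The composition acts on the embedding space directly...
W_OV = W_O @ W_V                               # shape: (d_embed, d_embed)

# ...but its rank is capped by the smaller value dimension, so it is low-rank.
print(W_OV.shape)                   # (12, 12)
print(np.linalg.matrix_rank(W_OV))  # 4
```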
https://transformer-circuits.pub/2021…


An early paper on how directions in embedding spaces have meaning:
https://arxiv.org/pdf/1301.3781.pdf

Timestamps:
0:00 – Who this was made for
0:41 – What are large language models?
7:48 – Where to learn more

This one is a bit more symbol-heavy, and that’s actually the point. The goal here is to represent in somewhat more formal terms the intuition for how backpropagation works in part 3 of the series, hopefully providing some connection between that video and other texts/code that you come across later.
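To make that connection concrete, here is a minimal sketch in the spirit of the video (the values and variable names are invented for illustration, not the video’s own example): one sigmoid neuron, a squared-error cost, and the chain rule written out as the product of partial derivatives, checked against a finite-difference nudge.

```python
import math

# One neuron: a = sigma(w*x + b), cost C = (a - y)^2, with hypothetical values.
x, y, w, b = 1.5, 0.0, 0.8, -0.3
sigma = lambda z: 1 / (1 + math.exp(-z))

z = w * x + b
a = sigma(z)

# Chain rule: dC/dw = (dC/da) * (da/dz) * (dz/dw)
dC_da = 2 * (a - y)
da_dz = a * (1 - a)        # derivative of the sigmoid
dz_dw = x
grad_w = dC_da * da_dz * dz_dw

# Sanity check: nudging w directly gives (almost) the same slope.
eps = 1e-6
C = lambda w_: (sigma(w_ * x + b) - y) ** 2
numeric = (C(w + eps) - C(w - eps)) / (2 * eps)
print(abs(grad_w - numeric) < 1e-8)  # True
```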

For more on backpropagation:
http://neuralnetworksanddeeplearning….
https://github.com/mnielsen/neural-ne…
https://colah.github.io/posts/2015-08-Backprop

The following video is sort of an appendix to this one. The main goal with the follow-on video is to show the connection between the visual walkthrough here and the representation of these “nudges” in terms of partial derivatives that you will find when reading about backpropagation in other resources, like Michael Nielsen’s book or Chris Olah’s blog.

Video timeline:
0:00 – Introduction
0:23 – Recap
3:07 – Intuitive walkthrough example
9:33 – Stochastic gradient descent
12:28 – Final words

To learn more, I highly recommend the book by Michael Nielsen
http://neuralnetworksanddeeplearning….
The book walks through the code behind the example in these videos, which you can find here:
https://github.com/mnielsen/neural-ne…

MNIST database:
http://yann.lecun.com/exdb/mnist/

Also check out Chris Olah’s blog:
http://colah.github.io/
His post on neural networks and topology is particularly beautiful, but honestly all of the stuff there is great.

And if you like that, you’ll love the publications at distill:
https://distill.pub/

What are the neurons, why are there layers, and what is the math underlying it?
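As a rough preview of the math involved, a minimal sketch (with made-up sizes, e.g. 784 inputs feeding 16 neurons, as in an MNIST-style network): each neuron computes a weighted sum of the previous layer’s activations, adds a bias, and squishes the result, so a whole layer is one matrix-vector product passed through an activation function.

```python
import numpy as np

def sigmoid(z):
    # Squishes any real number into the range (0, 1).
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)
a = rng.random(784)                 # activations of the previous layer
W = rng.standard_normal((16, 784))  # one row of weights per neuron
b = rng.standard_normal(16)         # one bias per neuron

# The whole layer at once: a' = sigma(W a + b)
a_next = sigmoid(W @ a + b)
print(a_next.shape)  # (16,)
```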

Typo correction: At 14:45, the last index on the bias vector is written as n, when it should in fact be k. Thanks for the sharp eyes that caught that!

There are two neat things about this book. First, it’s available for free, so consider joining me in making a donation to Nielsen if you get something out of it. And second, it’s centered around walking through some code and data which you can download yourself, and which covers the same example that I introduce in this video. Yay for active learning!

https://github.com/mnielsen/neural-networks-and-deep-learning

Leaders can’t be afraid to disrupt the status quo, says pharmaceutical CEO Paul Hudson. In conversation with TED’s Lindsay Levin, he shares how AI eliminates “unglamorous work” and speeds up operations while collaborations across competitors can dramatically boost sustainability. Hear some powerful advice for the modern leader — and learn why it’s time for businesses to embrace AI.

Can artificial intelligence be funny, or is comedy a uniquely human trait? In this witty and insightful talk, cartoonist Bob Mankoff explores the art of humor, the evolution of AI and what happens when the two collide. (Recorded at TEDxUofM on February 9, 2024)