Kaylee Lubick

Building an ML Transformer in a Spreadsheet

"Attention Is All You Need" introduced Transformer models, which have been wildly effective at solving a wide range of Machine Learning problems. However, the 10-page paper is incredibly dense: there are so many details that it is difficult to gain high-level insight into how Transformers work and why they are effective.

After several people-months of reading other blog posts about them, the team at Concepts Illuminated understood them well enough to create a Transformer in a spreadsheet and make a video walking through it.

[Image: Diagram showing how Transformers have alternating]

At a high level, Transformers are effective because they transform the data into representations that make patterns easier to find. They build on ideas from Convolutional Neural Networks (focus) and Recurrent Neural Networks (memory), combining them in something called self-attention.
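To make self-attention concrete, here is a minimal sketch of a single attention head in NumPy. This is not the spreadsheet's implementation, just the standard scaled dot-product formulation from the paper; the matrix names and sizes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Each token "scores" every other token; scale by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted mix of value vectors

# Illustrative sizes: 5 tokens, model width 8, head width 4
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

Each output row is a blend of the value vectors, weighted by how strongly that token attends to every token in the sequence, which is where the pattern-finding power comes from.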

The video covers these ideas in more detail, and the linked spreadsheet contains the implemented Transformer. Skip to the "Appendix" sheet if you want to see a layer with all the bells and whistles, including multi-headed attention and residual connections.
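Residual connections, one of those bells and whistles, can be sketched in a few lines: the sublayer's output is added back to its input, so the original signal is never lost, and the sum is then normalized. The feed-forward sublayer and layer sizes below are illustrative assumptions, not the spreadsheet's exact layout.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 8))                 # 5 tokens, width 8
W1, W2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 8))

def feed_forward(x):
    return np.maximum(x @ W1, 0) @ W2       # two-layer MLP with ReLU

# Residual "skip": add the input back, then normalize
out = layer_norm(x + feed_forward(x))
print(out.shape)  # (5, 8)
```

Because the sublayer's output is added to (rather than replacing) its input, gradients can flow straight through the skip path, which is a big part of why deep stacks of these layers remain trainable.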

Implementing the Transformer really helped me understand all the components. I'm especially proud of our metaphor of "scoring points" for explaining self-attention.
