Let's build a nano transformer from scratch using pure PyTorch


There's only one way to deeply understand something: get your hands dirty and start building it from the ground up!

In this repo we implement and train a small decoder-only transformer: Shakespeare GPT.
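At its core, a decoder-only transformer is built around masked (causal) self-attention. Below is a minimal sketch of a single attention head in pure PyTorch, in the spirit of the lecture series; the class name, shapes, and hyperparameters here are illustrative, not necessarily identical to the code in Bigram.ipynb:

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

class Head(nn.Module):
    """One head of masked (causal) self-attention."""
    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # lower-triangular mask so each position only attends to the past
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k = self.key(x)    # (B, T, head_size)
        q = self.query(x)  # (B, T, head_size)
        # scaled dot-product attention scores
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5  # (B, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float('-inf'))
        wei = F.softmax(wei, dim=-1)
        v = self.value(x)  # (B, T, head_size)
        return wei @ v     # (B, T, head_size)
```

Stacking several such heads, then adding a feed-forward layer, residual connections, and layer norm gives one transformer block; the full model is a stack of those blocks on top of token and position embeddings.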

You'll find 3 main files:

- Bigram.ipynb: the implementation of the transformer
- train.py: run this if you want to train your own model on a dataset (see the training sketch after this list)
- inferencing_trained_model.ipynb: see how my final trained model speaks Shakespeare
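As a rough picture of what training involves, here is a minimal, self-contained sketch under two assumptions: the Tiny Shakespeare text lives in input.txt, and we train the simple bigram model the series starts from (train.py itself trains the full transformer; all hyperparameters below are illustrative):

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

# assumed dataset file name; character-level tokenization as in the lectures
text = open('input.txt', encoding='utf-8').read()
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

block_size, batch_size = 8, 32  # illustrative hyperparameters

def get_batch():
    # random chunks of text; targets are the inputs shifted by one character
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x, y

class BigramLanguageModel(nn.Module):
    """The series' starting point: each token directly predicts the next."""
    def __init__(self, vocab_size):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets=None):
        logits = self.token_embedding(idx)  # (B, T, vocab_size)
        if targets is None:
            return logits, None
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
        return logits, loss

model = BigramLanguageModel(len(chars))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
for step in range(1000):
    xb, yb = get_batch()
    _, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
```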

I followed along with sensei 👨‍🏫 Andrej Karpathy's Neural Networks: Zero to Hero series. I think it's the best resource out there to get started and progress quickly in deep neural nets.

The model was trained for 50 minutes on an NVIDIA Tesla T4 GPU, which is available for free in a Colab notebook.

I was able to get the final loss down to 0.976.

You can take the model for a test run using the inferencing_trained_model notebook, as I've also included the trained model weights in the repo, or see some 2K tokens generated by the model in 2k_lines_output.txt.
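For reference, generating text like that is plain autoregressive sampling: predict a distribution over the next token, sample it, append it, repeat. The sketch below assumes a model whose forward pass returns (logits, loss); the function and argument names are illustrative, not the notebook's actual API:

```python
import torch
from torch.nn import functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size):
    """Sample tokens one at a time, feeding each back into the model."""
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]   # crop context to the block size
        logits, _ = model(idx_cond)       # assumes model returns (logits, loss)
        logits = logits[:, -1, :]         # logits for the last position only
        probs = F.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, idx_next), dim=1)
    return idx
```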

About

From-scratch implementation of a small transformer language model, inspired by Andrej Karpathy's makemore series! :)
