A simple and curious tiny GPT model trained on the Tiny Shakespeare dataset.
To develop this tiny GPT model, I followed Andrej Karpathy's excellent YouTube tutorial. I also drew on key papers to apply the concepts, such as "Attention Is All You Need" and "Language Models are Few-Shot Learners."
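
To illustrate the core idea from those papers, here is a minimal sketch of a single causal self-attention head in PyTorch, in the style of Karpathy's tutorial. The hyperparameter names (`n_embd`, `head_size`, `block_size`) are illustrative assumptions, not necessarily the exact values used in this repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionHead(nn.Module):
    """One head of causal (masked) self-attention, as in 'Attention Is All You Need'."""

    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # lower-triangular mask so each position only attends to earlier positions
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k = self.key(x)    # (B, T, head_size)
        q = self.query(x)  # (B, T, head_size)
        # scaled dot-product attention scores
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5  # (B, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float('-inf'))
        wei = F.softmax(wei, dim=-1)
        v = self.value(x)  # (B, T, head_size)
        return wei @ v     # (B, T, head_size)
```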
For training, the Tiny Shakespeare dataset (~1,115,394 characters) was used, with 5,000 training iterations.
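
For context, character-level training on Tiny Shakespeare typically starts with a setup like the sketch below. The file name `input.txt` and the `block_size`/`batch_size` values are assumptions for illustration, not necessarily this repo's exact configuration:

```python
import torch

# load the Tiny Shakespeare text (file name is an assumption)
with open('input.txt', 'r', encoding='utf-8') as f:
    text = f.read()  # ~1,115,394 characters

# character-level vocabulary and encode/decode mappings
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}
encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: ''.join(itos[i] for i in ids)

# encode the whole text and hold out the last 10% for validation
data = torch.tensor(encode(text), dtype=torch.long)
n = int(0.9 * len(data))
train_data, val_data = data[:n], data[n:]

def get_batch(split, block_size=64, batch_size=32):
    """Sample a random batch of input sequences and next-character targets."""
    d = train_data if split == 'train' else val_data
    ix = torch.randint(len(d) - block_size, (batch_size,))
    x = torch.stack([d[i:i + block_size] for i in ix])
    y = torch.stack([d[i + 1:i + block_size + 1] for i in ix])
    return x, y
```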
The final model has 209,729 trainable parameters and is capable of producing readable, well-punctuated sentences, albeit somewhat disconnected ones.
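
Text like this is sampled autoregressively from the trained model. The sketch below assumes the model's `forward` returns `(logits, loss)` as in Karpathy's tutorial, which may differ from this repo's exact interface:

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size=64):
    """Autoregressively sample new characters from the trained model."""
    model.eval()
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]   # crop context to the block size
        logits, _ = model(idx_cond)       # (B, T, vocab_size)
        logits = logits[:, -1, :]         # keep only the last position
        probs = torch.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, idx_next), dim=1)
    return idx

# usage: start from a single token of index 0 and sample 500 new characters
# context = torch.zeros((1, 1), dtype=torch.long)
# print(decode(generate(model, context, 500)[0].tolist()))
```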