A from-scratch PyTorch LLM implementing a Sparse Mixture-of-Experts (MoE) architecture with Top-2 gating. It integrates modern Llama-3-style components (RMSNorm, SwiGLU, RoPE, GQA) with a custom-built Byte-Level BPE tokenizer, and is pre-trained on a curated corpus of existential and dark philosophical literature.
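To make the Top-2 gating concrete, the sketch below shows how a sparse MoE feed-forward block of this kind is typically wired: a linear router scores the experts, each token keeps only its two highest-scoring SwiGLU experts, and their outputs are mixed with the renormalized router weights. This is a minimal illustration, not the repository's actual code; the class names, expert count, and dimensions (`MoELayer`, `SwiGLUExpert`, `n_experts`, `d_model`, `d_ff`) are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUExpert(nn.Module):
    """One expert: the SwiGLU feed-forward used in Llama-style blocks."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)
        self.w_up = nn.Linear(d_model, d_ff, bias=False)
        self.w_down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


class MoELayer(nn.Module):
    """Sparse MoE: a linear router dispatches each token to its top-2 experts."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([SwiGLUExpert(d_model, d_ff) for _ in range(n_experts)])
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        flat = x.reshape(-1, d)                             # (b*t, d)
        logits = self.router(flat)                          # (b*t, n_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # top-2 scores per token
        weights = weights.softmax(dim=-1)                   # renormalize the 2 kept scores
        out = torch.zeros_like(flat)
        for e, expert in enumerate(self.experts):
            # Tokens that routed to expert e (each token appears at most once here).
            token_idx, slot = (indices == e).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue
            out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(flat[token_idx])
        return out.reshape(b, t, d)


# Drop-in replacement for a dense FFN: same input/output shape,
# but only 2 of the n_experts run for any given token.
y = MoELayer(d_model=256, d_ff=1024)(torch.randn(2, 16, 256))  # -> (2, 16, 256)
```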
The EXIS-MoE project runs as a clear, sequential four-stage process:

1. Install the required dependencies.
2. Run the dataloader script, which curates the specialized philosophical and horror corpus from Project Gutenberg and generates the input.txt file.
3. Run train.py. This script trains the custom BPE tokenizer, constructs the Sparse Mixture-of-Experts (MoE) architecture, optimizes the model over the dataset with AdamW, and saves the resulting weights as frad.pth.
4. Run inference.py, which loads the trained weights, initializes the KV cache, and starts the auto-regressive generation loop (sketched below), enabling interaction with the domain-specialized model.
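A KV-cached generation loop of the kind step 4 describes usually prefills the cache with the full prompt once, then feeds only the newest token on each subsequent step. The sketch below illustrates that pattern; it assumes a model whose forward pass returns `(logits, updated_cache)` and a tokenizer exposing `encode`/`decode`. The `generate()` helper, the `past_kv` convention, and temperature sampling are assumptions for illustration, not inference.py's actual API.

```python
import torch


@torch.no_grad()
def generate(model, tokenizer, prompt: str, max_new_tokens: int = 200,
             temperature: float = 0.8, device: str = "cpu") -> str:
    # Encode the prompt and prefill the KV cache in a single forward pass.
    tokens = torch.tensor([tokenizer.encode(prompt)], device=device)  # (1, T)
    logits, past_kv = model(tokens, past_kv=None)

    for _ in range(max_new_tokens):
        # Sample the next token from the distribution at the last position.
        probs = torch.softmax(logits[:, -1, :] / temperature, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)          # (1, 1)
        tokens = torch.cat([tokens, next_token], dim=1)
        # With a warm cache, only the newest token needs a forward pass.
        logits, past_kv = model(next_token, past_kv=past_kv)

    return tokenizer.decode(tokens[0].tolist())
```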