rmsnorm
Here are 11 public repositories matching this topic...
Efficient kernel for RMS normalization with fused operations; includes both forward and backward passes and is compatible with PyTorch.
Updated Jun 5, 2024 - Python
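For context on what these kernels compute: RMSNorm scales each activation vector by the reciprocal of its root-mean-square and applies a learned per-feature gain. A minimal PyTorch sketch (class and variable names are illustrative, not taken from the repository above):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square layer normalization: y = x / rms(x) * g."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-feature gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # RMS over the last (feature) dimension; eps guards against division by zero
        rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x / rms * self.weight
```

Unlike LayerNorm there is no mean subtraction and no bias term, which is part of what makes a single fused forward/backward kernel attractive.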
Simple and easy-to-understand PyTorch implementation of Large Language Models (LLMs), GPT and LLaMA, from scratch with detailed steps. Implemented: Byte-Pair Tokenizer, Rotary Positional Embedding (RoPE), SwishGLU, RMSNorm, Mixture of Experts (MoE). Tested on a Taylor Swift song-lyrics dataset.
Updated Nov 18, 2024 - Python
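Of the components listed, rotary positional embeddings admit a compact sketch: pairs of feature channels are rotated by position-dependent angles, so relative position shows up in attention dot products. A hedged illustration (the function name and "split-half" pairing convention are assumptions, not taken from the repository):

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x (shape ..., seq_len, dim; dim even) by
    position-dependent angles, as in rotary positional embeddings."""
    *_, seq_len, dim = x.shape
    half = dim // 2
    # one frequency per channel pair, geometrically spaced
    freqs = base ** (-torch.arange(half, dtype=x.dtype) / half)
    angles = torch.arange(seq_len, dtype=x.dtype)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]  # pair channel i with channel i + half
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

Because each pair is a pure rotation, the transform preserves the norm of every token vector, and position 0 (zero angle) is left unchanged.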
A modern minimal LLM implementation made to be easily modified by non-professionals and trained on consumer hardware.
Updated Dec 14, 2025 - Python
Simple character-level Transformer.
Updated May 27, 2024 - Jupyter Notebook
Nano-scale generative models, for fun. No SOTA here, nano first.
Updated Jul 27, 2025 - Jupyter Notebook
A from-scratch PyTorch LLM implementing Sparse Mixture-of-Experts (MoE) with Top-2 gating. Integrates modern Llama-3 components (RMSNorm, SwiGLU, RoPE, GQA) and a custom-coded Byte-Level BPE tokenizer. Pre-trained on a curated corpus of existential and dark philosophical literature.
Updated Dec 1, 2025 - Python
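Top-2 gating, as mentioned above, routes each token to its two highest-scoring experts and renormalizes those two scores into mixing weights. A minimal sketch (function names are illustrative, and the dense combine loop is for clarity; production implementations dispatch tokens to experts sparsely):

```python
import torch
import torch.nn.functional as F

def top2_route(router_logits: torch.Tensor):
    """router_logits: (tokens, n_experts) -> per-token (weights, expert indices)."""
    top_vals, top_idx = router_logits.topk(2, dim=-1)
    weights = F.softmax(top_vals, dim=-1)  # renormalize over the chosen two
    return weights, top_idx

def moe_forward(x, router_logits, experts):
    """x: (tokens, dim); experts: list of callables mapping (m, dim) -> (m, dim)."""
    weights, idx = top2_route(router_logits)
    out = torch.zeros_like(x)
    for k in range(2):  # the two selected experts per token
        for e, expert in enumerate(experts):
            mask = idx[:, k] == e  # tokens whose k-th choice is expert e
            if mask.any():
                out[mask] += weights[mask, k : k + 1] * expert(x[mask])
    return out
```

The softmax over only the selected two logits keeps the combined output a convex mixture of exactly two expert outputs per token.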
Optimized fused RMSNorm implementation in CUDA. Features vectorized memory access (float4), warp-level reductions, and an efficient backward pass for LLM training.
Updated Dec 24, 2025 - Python
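A fused backward kernel like this is typically validated against a hand-derived reference gradient. For y = x · w / rms(x), the input gradient has a closed form; a pure-PyTorch reference sketch (illustrative, not the repository's code) that can be checked against autograd:

```python
import torch

def rmsnorm(x, w, eps=1e-6):
    rms = (x.pow(2).mean(-1, keepdim=True) + eps).sqrt()
    return x / rms * w

def rmsnorm_grad_x(x, w, g, eps=1e-6):
    """Closed-form dL/dx for y = x * w / rms(x), given upstream gradient g."""
    rms = (x.pow(2).mean(-1, keepdim=True) + eps).sqrt()
    wg = w * g
    # product rule: direct term, minus the correction from rms depending on x
    return wg / rms - x * (x * wg).mean(-1, keepdim=True) / rms.pow(3)
```

The per-row mean in the correction term is the reduction a fused kernel computes with warp-level primitives instead of a separate pass.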
🚀 Build your own LLM easily with OpenLabLM, a lightweight, hackable codebase tailored for hobbyists using a single consumer GPU.
Updated Jan 4, 2026 - Python