Natural Language Processing (NLP) | Sequence Modeling | Text Generation
Before Transformers (GPT) revolutionized AI, Recurrent Neural Networks (RNNs) were the state-of-the-art for understanding sequential data.
NeuroLex is a character-level language model that predicts the probability distribution over the next character in a sequence. By using Long Short-Term Memory (LSTM) networks, the engine mitigates the vanishing-gradient problem found in vanilla RNNs, allowing it to generate coherent text by retaining long-term context (memory).
- Predictive Typing: Mobile keyboards (Autocorrect) need to predict the next word in milliseconds with limited compute.
- Context Retention: Simple models forget the beginning of a sentence by the time they reach the end, leading to nonsensical output.
- Sequence Dependency: Understanding that "Bank" means something different in "Bank of the river" vs. "Bank of America" requires sequential memory.
I engineered a sequence model that acts as a "Next-Token Predictor" (here, predicting the next character): the same fundamental task that underpins modern LLMs.
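A minimal sketch, assuming PyTorch (the project's framework), of what such a character-level predictor can look like; the class name `CharLSTM` and the layer sizes are illustrative rather than the exact NeuroLex configuration:

```python
import torch.nn as nn

class CharLSTM(nn.Module):
    """Character-level next-token predictor: embedding -> LSTM -> per-character logits."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, hidden=None):
        # x: (batch, seq_len) of integer character indices
        emb = self.embedding(x)               # (batch, seq_len, embed_dim)
        out, hidden = self.lstm(emb, hidden)  # hidden = (h, c) carries long-range context
        return self.fc(out), hidden           # logits: (batch, seq_len, vocab_size)
```

The returned `(h, c)` state is what the LSTM's gates write to and read from, which is how context survives across long sequences.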
| Feature | Technical Implementation | PM Value Proposition |
|---|---|---|
| Long-Term Memory | LSTM Architecture | Uses gating mechanisms (input, forget, and output gates) to decide what information to keep or discard over long sequences. |
| Training Stability | Teacher Forcing | A training strategy that feeds the model the actual previous token (ground truth) instead of its own prediction, stabilizing convergence (see the training-step sketch below). |
| Diversity Control | Temperature Scaling | A hyperparameter that controls the "creativity" of the output (low temperature = deterministic, high temperature = creative/random). |
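To make the Teacher Forcing row concrete, here is a hedged sketch of a single training step, assuming integer-encoded character batches and a model like the illustrative `CharLSTM` above; the input at each position is the ground-truth previous character, never the model's own prediction:

```python
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

def train_step(model, batch, optimizer):
    """One teacher-forced step: inputs are ground-truth characters, targets are the characters that follow."""
    inputs, targets = batch[:, :-1], batch[:, 1:]    # shift the sequence by one position
    logits, _ = model(inputs)                        # the model never consumes its own predictions here
    loss = criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()                                  # backpropagation through time over the unrolled sequence
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # guard against exploding gradients
    optimizer.step()
    return loss.item()
```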
Defining the architecture requires balancing computational cost with memory retention.
| Experiment | Configuration | Outcome | Decision |
|---|---|---|---|
| Backbone | Vanilla RNN vs. LSTM | The RNN failed to capture dependencies more than 10 characters back (vanishing gradient). | ✅ Selected LSTM |
| Optimization | Adam vs. SGD | Adam converged 3x faster on this sparse text data. | ✅ Selected Adam |
| Regularization | Weight Tying | Tying the input embedding weights to the output weights reduced the parameter count by 40% (see the sketch below). | ✅ Implemented |
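The weight-tying change can be sketched as follows; how the sharing is wired here (the decoder reuses the embedding matrix, which requires the LSTM hidden size to equal the embedding size) is an assumption, not necessarily the exact repo implementation:

```python
import torch.nn as nn

class TiedCharLSTM(nn.Module):
    """Weight tying: the output projection shares the embedding matrix, shrinking the parameter count."""
    def __init__(self, vocab_size, embed_dim=256, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Hidden size equals embed_dim so the decoder can reuse the embedding weights directly.
        self.lstm = nn.LSTM(embed_dim, embed_dim, num_layers, batch_first=True)
        self.decoder = nn.Linear(embed_dim, vocab_size, bias=False)
        self.decoder.weight = self.embedding.weight  # both layers now point at one (vocab_size, embed_dim) tensor

    def forward(self, x, hidden=None):
        out, hidden = self.lstm(self.embedding(x), hidden)
        return self.decoder(out), hidden
```

Because the tied tensor has shape `(vocab_size, embed_dim)`, the vocabulary-sized matrix is stored once instead of twice, which is where the parameter savings come from.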
The model was trained on a corpus of text and asked to "continue" a prompt.
Prompt: "The quick brown fox"
NeuroLex Generation (T=0.8): "...jumps over the lazy dog and runs into the deep forest where the sun does not shine. It is a time of great mystery..."
(Note: The model learned grammar, spacing, and sentence structure purely from raw character sequences.)
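A hedged sketch of how temperature-scaled sampling (the T=0.8 above) can be implemented; `stoi` and `itos` are assumed character-to-index lookup tables rather than names taken from the repo:

```python
import torch

@torch.no_grad()
def generate(model, prompt, stoi, itos, max_new_chars=200, temperature=0.8):
    """Sample one character at a time; temperature rescales the logits before the softmax."""
    model.eval()
    idx = torch.tensor([[stoi[c] for c in prompt]])      # (1, prompt_len)
    hidden, out_chars = None, list(prompt)
    for _ in range(max_new_chars):
        logits, hidden = model(idx, hidden)              # the hidden state carries context forward
        logits = logits[:, -1, :] / temperature          # low T -> near-greedy, high T -> more random
        probs = torch.softmax(logits, dim=-1)
        next_idx = torch.multinomial(probs, num_samples=1)
        out_chars.append(itos[next_idx.item()])
        idx = next_idx                                   # feed only the newly sampled character next step
    return "".join(out_chars)
```

Dividing the logits by a temperature below 1 sharpens the distribution toward the most likely character, while a temperature above 1 flattens it: the determinism-versus-creativity dial described in the feature table.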
- Framework: PyTorch
- Architecture: LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit)
- Metric: Perplexity (lower is better; see the sketch below)
- Optimization: Backpropagation Through Time (BPTT)
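Perplexity is the exponential of the average per-character cross-entropy, so it falls straight out of the validation loss; a minimal sketch, assuming batches shaped like the training data:

```python
import math
import torch
import torch.nn as nn

@torch.no_grad()
def perplexity(model, data_loader):
    """Perplexity = exp(mean cross-entropy per predicted character); lower is better."""
    model.eval()
    criterion = nn.CrossEntropyLoss(reduction="sum")
    total_loss, total_chars = 0.0, 0
    for batch in data_loader:                    # batch: (batch_size, seq_len) of character indices
        inputs, targets = batch[:, :-1], batch[:, 1:]
        logits, _ = model(inputs)
        total_loss += criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1)).item()
        total_chars += targets.numel()
    return math.exp(total_loss / total_chars)
```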
```bash
# Clone the repository
git clone https://github.com/skandvj/HW4P1-Language-Modelling.git

# Install dependencies
pip install -r requirements.txt

# Train the model
python train.py --model lstm --epochs 20 --batch_size 64

# Generate text
python generate.py --prompt "The future of AI is" --temperature 0.7
```
