🤖 Nano R1 – A Lightweight Reasoning AI Model

Nano R1 is a compact AI project inspired by the architecture of R1, designed to perform basic reasoning tasks using reinforcement learning. This project demonstrates how a minimal model can be trained from scratch to perform decision-making and reasoning using a simplified environment.

🧠 Core Idea

The goal of Nano R1 is to explore how reinforcement learning agents can learn logical or step-by-step reasoning behaviors, even with limited resources and a simplified architecture.

🚀 Features

📚 Trains on GSM8K (Grade School Math 8K) dataset
🧠 Trained using Reinforcement Learning (RL)
🧮 Performs step-by-step reasoning to solve math problems
🔬 Compact model size suitable for low-resource training
📈 Basic reward feedback loop for learning
📦 Modular code structure for ease of experimentation

📚 Tech Stack

🐍 Python
🧠 TensorFlow / PyTorch (whichever you used)
📊 NumPy
🤖 Hugging Face Datasets

base_model: unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit library_name: peft license: apache-2.0 pipeline_tag: text-generation language:

en tags:
unsloth
grpo
trl
transformers
qwen2.5
text-generation-inference
PyTorch
gsm8k

Model Description

Developed by: Jeesan Abbas
License: Apache license 2.0
Finetuned from model: unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit

🙋‍♂️ Author

Made with ❤️ by Jeesan Abbas

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md
adapter_config.json		adapter_config.json
added_tokens.json		added_tokens.json
config.json		config.json
generation_config (1).json		generation_config (1).json
gitattributes		gitattributes
merges.txt		merges.txt
special_tokens_map.json		special_tokens_map.json
tokenizer (1).json		tokenizer (1).json
tokenizer_config.json		tokenizer_config.json
vocab.json		vocab.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🤖 Nano R1 – A Lightweight Reasoning AI Model

🧠 Core Idea

🚀 Features

📚 Tech Stack

Model Description

🙋‍♂️ Author

About

Uh oh!

Releases

Packages

DeveloperZeeshu/Nano_R1-model

Folders and files

Latest commit

History

Repository files navigation

🤖 Nano R1 – A Lightweight Reasoning AI Model

🧠 Core Idea

🚀 Features

📚 Tech Stack

Model Description

🙋‍♂️ Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages